Abstract

We consider the noise complexity of differentially private mechanisms in the setting where
the user asks $d$ linear queries $f\colon \mathbb{R}^n \to \mathbb{R}$ non-adaptively. Here, the database is represented by a
vector in $\mathbb{R}^n$ and proximity between databases is measured in the $\ell_1$-metric.

We show that the noise complexity is determined by two geometric parameters associated
with the set of queries. We use this connection to give tight upper and lower bounds on the
noise complexity for any $d \le n$. We show that for $d$ random linear queries of sensitivity 1, it is
necessary and sufficient to add $\ell_2$-error $\Theta(\min\{d\sqrt{d}/\varepsilon,\ d\sqrt{\log(n/d)}/\varepsilon\})$ to achieve $\varepsilon$-differential
privacy. Assuming the truth of a deep conjecture from convex geometry, known as the Hyperplane
conjecture, we can extend our results to arbitrary linear queries giving nearly matching upper
and lower bounds.

Our bound translates to error $O(\min\{d/\varepsilon,\ \sqrt{d\log(n/d)}/\varepsilon\})$ per answer. The best previous
upper bound (Laplacian mechanism) gives a bound of $O(\min\{d/\varepsilon,\ \sqrt{n}/\varepsilon\})$ per answer, while
the best known lower bound was $\Omega(\sqrt{d}/\varepsilon)$. In contrast, our lower bound is strong enough to
separate the concept of differential privacy from the notion of approximate differential privacy
where an upper bound of $O(\sqrt{d}/\varepsilon)$ can be achieved.

1 Introduction



The problem of privacy-preserving data analysis has attracted a lot of attention in recent years.
Several databases, e.g. those held by the Census Bureau, contain private data provided by individuals, 
and protecting the privacy of those individuals is an important concern. Differential Privacy is a 
rigorous notion of privacy that allows statistical analysis of sensitive data while providing strong 
privacy guarantees even in the presence of an adversary armed with arbitrary auxiliary information. 
We refer the reader to the survey of Dwork [Dwo08] and the references therein for further motivation 
and background information. 

We consider the following general setting: A database is represented by a vector $x \in \mathbb{R}^n$. The
queries that the analyst may ask are linear combinations of the entries of $x$. More precisely, a
multidimensional query is a map $F\colon \mathbb{R}^n \to \mathbb{R}^d$, and we will restrict ourselves to linear maps $F$
with coefficients in the interval $[-1, 1]$. Thus $F$ is a $d \times n$ matrix with entries in $[-1, 1]$. In this
work, we assume throughout that $d \le n$. A mechanism is a randomized algorithm which holds a
database $x \in \mathbb{R}^n$, receives a query $F\colon \mathbb{R}^n \to \mathbb{R}^d$ and answers with some $a \in \mathbb{R}^d$. Informally, we say a
mechanism satisfies differential privacy in this setting if the densities of the output distributions
on inputs $x, x' \in \mathbb{R}^n$ with $\|x - x'\|_1 \le 1$ are pointwise within an $\exp(\varepsilon)$ multiplicative factor of
each other. Here and in the following, $\varepsilon > 0$ is a parameter that measures the strength of the
privacy guarantee (smaller $\varepsilon$ being a stronger guarantee). The error of a mechanism is the expected
Euclidean distance between the correct answer $Fx$ and the actual answer $a$.

In this work, we use methods from convex geometry to determine a nearly optimal trade-off
between privacy and error. We prove a lower bound on how much error any differentially private
mechanism must add, and we present a mechanism whose error nearly matches this lower bound.

As mentioned, the above setup is fairly general. To illustrate it and facilitate comparison with 
previous work, we will describe some specific instantiations below. 

Histograms. Suppose we have a database $y \in [n]^N$, containing private information about $N$
individuals. We can think of each individual as belonging to one of $n$ types. The database $y$ can
then naturally be translated to a histogram $x \in \mathbb{R}^n$, i.e., $x_i$ counts the number of individuals of
type $i$. Note that in the definition of differential privacy, we require the mechanism to be defined
for all $x \in \mathbb{R}^n$ and demand that the output distributions be close whenever $\|x - x'\|_1 \le 1$. This
is a stronger requirement than asserting this property only for integer vectors $x$ and $x'$. It only
makes our upper bounds stronger. For the lower bounds, this strengthening allows us to ignore the
discretization issues that would arise in the usual definition. However, our lower bounds can be
extended to the usual definition for small enough $\varepsilon$ and large enough $N$ (see Appendix B). Now,
our upper bound holds for any linear query on the histogram. This includes some well-studied and
natural classes of queries. For instance, contingency tables (see, e.g., [BCD+07]) are linear queries
on the histogram.
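The translation from a database to a histogram and the evaluation of a linear query can be sketched as follows. This is only an illustration: the function names `histogram` and `answer` and the toy data are assumptions made here, not part of the paper.

```python
from collections import Counter

def histogram(y, n):
    """Map a database y in [n]^N (types numbered 1..n) to x in R^n,
    where x[i-1] counts the individuals of type i."""
    counts = Counter(y)
    return [float(counts.get(i, 0)) for i in range(1, n + 1)]

def answer(F, x):
    """Evaluate the linear query F (a list of d rows of length n) on x."""
    return [sum(f * v for f, v in zip(row, x)) for row in F]

# Hypothetical toy data: N = 6 individuals, n = 3 types.
y = [1, 3, 3, 2, 3, 1]
x = histogram(y, 3)

# Two counting queries with coefficients in [-1, 1].
F = [[1, 1, 0],   # number of individuals of type 1 or 2
     [0, 0, 1]]   # number of individuals of type 3
exact = answer(F, x)
```

A private mechanism would release a noisy version of `exact` rather than the exact vector.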

Private bits. In the setting looked at by Dinur and Nissim [DN03], the database $y \in \{0,1\}^N$
consists of one private bit for each individual and each query asks for the number of 1's amongst a
(random) subset of $[N]$. Given $d$ such queries, one can define $n \le 2^d$ types of individuals, depending
on the subset of the queries that ask about an individual. The vector $y$ then maps to a histogram $x$
in the natural way with $x_i$ denoting the number of individuals of type $i$ with their private bit set
to 1. Our results then imply a lower bound of $\Omega(d/\varepsilon)$ per answer for any $\varepsilon$-differentially private
mechanism. This improves on the $\Omega(\sqrt{d})$ bound for $d = N$ from [DN03] for a weaker privacy
definition (blatant non-privacy). A closely related rephrasing is to imagine each individual having $d$
private $\{0, 1\}$ attributes so that $n = 2^d$. The $d$ queries that ask for the 1-way marginals of the input
naturally map to a matrix $F$ and Theorem 1.1 implies a lower bound of $\Omega(d/\varepsilon)$ noise per marginal
for such queries.
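For the 1-way marginals, the matrix $F$ can be written down explicitly. The sketch below assumes that the $n = 2^d$ types are enumerated in lexicographic order; the function name `one_way_marginals` and the example histogram are illustrative choices made here.

```python
from itertools import product

def one_way_marginals(d):
    """Return the d x 2^d matrix F whose column for the type t in {0,1}^d
    is t itself, so that (Fx)_j counts the individuals with attribute j set."""
    types = list(product([0, 1], repeat=d))
    return [[t[j] for t in types] for j in range(d)]

F = one_way_marginals(3)

# Hypothetical histogram over the 8 types in lexicographic order:
# 2 individuals of type (0,0,1), 1 of type (0,1,1), 3 of type (1,1,0).
x = [0, 2, 0, 1, 0, 0, 3, 0]
marginals = [sum(f * c for f, c in zip(row, x)) for row in F]
```

All entries of this $F$ lie in $\{0,1\} \subseteq [-1,1]$, so it fits the setting described above.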

One can also look at $x$ itself as a database where each individual's private data is in $[0, 1]$; in this
setting the dimension of the data $n$ equals the number of individuals $N$. Our results lead to better
upper bounds for this setting.

Finally there are settings such as the recent work of [MM09] on private recommendation systems,
where the private data is transformed with a stability guarantee so that nearby databases get
mapped to vectors at $\ell_1$ distance at most 1.

1.1 Our results 

We relate the noise complexity of differentially private mechanisms to some geometric properties of
the image of the unit $\ell_1$-ball, denoted $B_1^n$, when applying the linear mapping $F$. We will denote the
resulting convex polytope by $K = FB_1^n$. Our first result lower bounds the noise any $\varepsilon$-differentially
private mechanism must add in terms of the volume of $K$.

Theorem 1.1. Let $\varepsilon > 0$ and suppose $F\colon \mathbb{R}^n \to \mathbb{R}^d$ is a linear map. Then, every $\varepsilon$-differentially private
mechanism $M$ has error at least $\Omega(\varepsilon^{-1} d\sqrt{d} \cdot \mathrm{Vol}(K)^{1/d})$ where $K = FB_1^n$.

Recall, the term error refers to the expected Euclidean distance between the output of the 
mechanism and the correct answer to the query F. 

We then describe a differentially private mechanism whose error depends on the expected $\ell_2$
norm of a randomly chosen point in $K$. Our mechanism is an instantiation of the exponential
mechanism [MT07] with the score function defined by the (negative of the) norm $\|\cdot\|_K$, that is
the norm which has $K$ as its unit ball. Hence, we will refer to this mechanism as the $K$-norm
mechanism. Note that as the definition of this norm depends on the query $F$, so does the output of
our mechanism.

Theorem 1.2. Let $\varepsilon > 0$ and suppose $F\colon \mathbb{R}^n \to \mathbb{R}^d$ is a linear map with $K = FB_1^n$. Then, the
$K$-norm mechanism is $\varepsilon$-differentially private and has error at most $O(\varepsilon^{-1} d\, \mathbb{E}_{z \in K}\|z\|_2)$.

As it turns out, when F is a random Bernoulli ±1 matrix our upper bound matches the lower 
bound up to constant factors. In this case, K is a random polytope and its volume and average 
Euclidean norm have been determined rather recently. Specifically, we apply a volume lower bound 
of Litvak et al. [LPRN05], and an upper bound on the average Euclidean norm due to Klartag and 
Kozma [KK09]. Quantitatively, we obtain the following theorem. 

Theorem 1.3. Let $\varepsilon > 0$ and $d \le n/2$. Then, for almost all matrices $F \in \{-1, 1\}^{d \times n}$,

1. any $\varepsilon$-differentially private mechanism $M$ has error $\Omega(d/\varepsilon) \cdot \min\{\sqrt{d}, \sqrt{\log(n/d)}\}$.

2. the $K$-norm mechanism is $\varepsilon$-differentially private with error $O(d/\varepsilon) \cdot \min\{\sqrt{d}, \sqrt{\log(n/d)}\}$.

We remark that Litvak et al. also give an explicit construction of a mapping F realizing the 
lower bound. 
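To get a feel for the rate in Theorem 1.3, the following sketch evaluates it next to the Laplacian rate $d\sqrt{d}/\varepsilon$ of Theorem 2.6. All hidden constants are dropped, so the numbers only indicate orders of magnitude; the function names are hypothetical.

```python
import math

def k_norm_total_error(d, n, eps):
    """Rate (d/eps) * min{sqrt(d), sqrt(log(n/d))} from Theorem 1.3,
    with the constant factor dropped."""
    return (d / eps) * min(math.sqrt(d), math.sqrt(math.log(n / d)))

def laplace_total_error(d, eps):
    """Rate d*sqrt(d)/eps of the Laplacian mechanism (Theorem 2.6),
    with the constant factor dropped."""
    return d * math.sqrt(d) / eps

# d = 100 queries on a histogram with n = 10^4 entries, eps = 1.
new_rate = k_norm_total_error(100, 10**4, 1.0)
old_rate = laplace_total_error(100, 1.0)
```

In this example $\log(n/d) < d$, so the new rate is smaller; once $\log(n/d) \ge d$ the two rates coincide.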

More generally, we can relate our upper and lower bounds whenever the body $K$ is in approximately
isotropic position. Informally, this condition implies that $\mathbb{E}_{z \in K}\|z\|_2 \sim \sqrt{d} \cdot \mathrm{Vol}(K)^{1/d} L_K$.
Here, $L_K$ denotes the so-called isotropic constant which is defined in Section 6.

Figure 1: Summary of results in comparison to best previous work for $d$ random linear queries each
of sensitivity 1 where $1 \le d \le n$. Note that informally the average per coordinate error is smaller than the
stated bounds by a factor of $\sqrt{d}$. Here, $(\varepsilon, \delta)$-differential privacy refers to a weaker approximate notion of privacy
introduced later. Our lower bound does not apply to this notion.

Theorem 1.4. Let $\varepsilon > 0$ and suppose $F\colon \mathbb{R}^n \to \mathbb{R}^d$ is a linear map such that $K = FB_1^n$ is in
approximately isotropic position. Then, the $K$-norm mechanism is $\varepsilon$-differentially private with error
at most $O(\varepsilon^{-1} d\sqrt{d} \cdot \mathrm{Vol}(K)^{1/d} \cdot L_K)$, where $L_K$ denotes the isotropic constant of $K$.

Notice that the bound in the previous theorem differs from the lower bound by a factor of $L_K$.
A central conjecture in convex geometry, sometimes referred to as the "Hyperplane Conjecture" or
"Slicing Conjecture" (see [KK09] for further information) states that $L_K = O(1)$.

Unfortunately, in general the polytope $K$ could be very far from isotropic. In this case, both
our volume-based lower bound and the $K$-norm mechanism can be quite far from optimal. We give
a recursive variant of our mechanism and a natural generalization of our volume-based lower bound
which are nearly optimal even if $K$ is non-isotropic.

Theorem 1.5. Let $\varepsilon > 0$. Suppose $F\colon \mathbb{R}^n \to \mathbb{R}^d$ is a linear map. Further, assume the Hyperplane
Conjecture. Then, the mechanism introduced in Section 7 is $\varepsilon$-differentially private and has error at
most $O(\log^{3/2} d) \cdot \mathrm{GVolLB}(F, \varepsilon)$, where $\mathrm{GVolLB}(F, \varepsilon)$ is a lower bound on the error of the optimal
$\varepsilon$-differentially private mechanism.

While we restricted our theorems to $F \in [-1, 1]^{d \times n}$, they apply more generally to any linear
mapping $F$.

Efficient Mechanisms. Our mechanism is an instantiation of the exponential mechanism and 
involves sampling random points from rather general high-dimensional convex bodies. This is why 
our mechanism is not efficient as it is. However, we can use rapidly mixing geometric random walks 
for the sampling step. These random walks turn out to approach the uniform distribution in a 
metric that is strong enough for our purposes. It will follow that both of our mechanisms can be 
implemented in polynomial time. 

Theorem 1.6. The mechanisms given in Theorem 1.2 and Theorem 1.5 can be implemented in
time polynomial in $n, 1/\varepsilon$ such that the stated error bound remains the same up to constant factors,
and the mechanism achieves $\varepsilon$-differential privacy.

We note that our lower bound GVolLB can also be approximated up to a constant factor.
Together these results give polynomial time computable upper and lower bounds on the error of any
differentially private mechanism, that are always within an $O(\log^{3/2} d)$ factor of each other.

Figure 1 summarizes our results. Note that we state our bounds in terms of the total $\ell_2$ error,
which informally is a $\sqrt{d}$ factor larger than the average per coordinate error.

1.2 Previous Work



Queries of the kind described above have (total) sensitivity $d$, and hence the work of Dwork et
al. [DMNS06] shows that adding Laplace noise with parameter $d/\varepsilon$ to each entry of $Fx$ ensures
$\varepsilon$-differential privacy. Moreover, adding Laplace noise to the histogram $x$ itself leads to another
private mechanism. Thus such questions can be answered with noise $\min(d/\varepsilon, \sqrt{n}/\varepsilon, N)$ per entry
of $Fx$. Some specific classes of queries can be answered with smaller error. Nissim, Raskhodnikova
and Smith [NRS07] show that one can add noise proportional to a smoothed version of the local
sensitivity of the query, which can be much smaller than the global sensitivity for some non-linear
queries. Blum, Ligett and Roth [BLR08] show that it is possible to release approximate counts for all
concepts in a concept class $C$ on $\{0, 1\}^m$ with error $O((N^2 m\, \mathrm{VCDim}(C)/\varepsilon)^{1/3})$, where $\mathrm{VCDim}(C)$ is
the VC dimension of the concept class. Their bounds are incomparable to ours, and in particular their
improvements over the Laplacian mechanism kick in when the number of queries is larger than the
size of the database (a range of parameters we do not consider). Feldman et al. [FFKN09] construct
private core sets for the $k$-median problem, enabling approximate computation of the $k$-median cost
of any set of $k$ facilities in $\mathbb{R}^d$. Private mechanisms with small error for other classes of queries have
also been studied in several other works, see e.g. [BDMN05, BCD+07, MT07, CM08, GLM+10].

Dinur and Nissim [DN03] initiated the study of lower bounds on the amount of noise private
mechanisms must add. They showed that any private mechanism that answers $O(N)$ random subset
sum queries about a set of $N$ people each having a private bit must add noise $\Omega(\sqrt{N})$ to avoid nearly
full disclosure of the database (blatant non-privacy). This implies that as one answers more and
more questions, the amount of error needed per answer must grow to provide any kind of privacy
guarantee. These results were strengthened by Dwork, McSherry and Talwar [DMT07], and by
Dwork and Yekhanin [DY08]. However, all these lower bounds protect against blatant non-privacy
and cannot go beyond noise larger than $\min(\sqrt{d}, \sqrt{N})$ per answer, for $d$ queries. Kasiviswanathan,
Rudelson and Smith [KRS09] show lower bounds of the same nature ($\min(\sqrt{d}, \sqrt{N})$ for $d$ questions)
for a more natural and useful class of questions. Their lower bounds also apply to $(\varepsilon, \delta)$-differential
privacy and are tight when $\varepsilon$ and $\delta$ are constant. For the case of $d = 1$, Ghosh, Roughgarden
and Sundararajan [GRS09] show that adding Laplace noise is in fact optimal in a very general
decision-theoretic framework, for any symmetric decreasing loss function. For the case that all
sum queries need to be answered (i.e. all queries of the form $f_P(y) = \sum_{i=1}^{N} P(y_i)$ where $P$ is a
0-1 predicate), Dwork et al. [DMNS06] show that any differentially private mechanism must add
noise $\Omega(N)$. Rastogi et al. [RSH07] show that half of such queries must have error $\Omega(\sqrt{N})$. Blum,
Ligett and Roth [BLR08] show that any differentially private mechanism answering all (real-valued)
halfspace queries must add noise $\Omega(N)$.

1.3 Overview and organization of the paper 

In this section we will give a broad overview of our proof and outline the remainder of the paper. 

Section 2 contains some preliminary facts and definitions. Specifically, we describe a linear 
program that defines the optimal mechanism for any set of queries. This linear program (also 
studied in [GRS09] for the one-dimensional case) is exponential in size, but in principle, given any 
query and error function, can be used to compute the best mechanism for the given set of queries. 
Moreover, dual solutions to this linear program can be used to prove lower bounds on the error. 
However, the asymptotic behavior of the optimum value of these programs for multi-dimensional
queries was not understood prior to this work. Our lower bounds can be reinterpreted as dual
solutions to the linear program. The upper bounds give near optimal primal solutions. Also, our
results lead to a polynomial-time approximation algorithm for the optimum when $F$ is linear.

We prove our lower bound in Section 3. Given a query $F\colon \mathbb{R}^n \to \mathbb{R}^d$, our lower bound depends on
the $d$-dimensional volume of $K = FB_1^n$. If the volume of $K$ is large, then a packing argument shows
that we can pack exponentially many points inside $K$ so that each pair of points is far from each
other. We then scale up $K$ by a suitable factor $\lambda$. By linearity, all points within $\lambda K$ have preimages
under $F$ that are still $\lambda$-close in $\ell_1$-distance. Hence, the definition of $\varepsilon$-differential privacy (by
transitivity) enforces some constraint between these preimages. We can combine these observations
so as to show that any differentially private mechanism $M$ will have to put significant probability
mass in exponentially many disjoint balls. This forces the mechanism to have large expected error.

We then introduce the $K$-norm mechanism in Section 4. Our mechanism computes $Fx$ and
then adds a noise vector to $Fx$. The key point here is that the noise vector is not independent of
$F$ as in previous works. Instead, informally speaking, the noise is tailored to the exact shape of
$K = FB_1^n$. This is accomplished by picking a particular noise vector $a$ with probability proportional
to $\exp(-\varepsilon\|Fx - a\|_K)$. Here, $\|\cdot\|_K$ denotes the (Minkowski) norm defined by $K$. While our
mechanism depends upon the query $F$, it does not depend on the particular database $x$. We can
analyze our mechanism in terms of the expected Euclidean distance from the origin of a random
point in $K$, i.e., $\mathbb{E}_{z \in K}\|z\|_2$. Arguing optimality of our mechanism hence boils down to relating
$\mathbb{E}_{z \in K}\|z\|_2$ to the volume of $K$, which is the goal of the next section.

Indeed, using several results from convex geometry, we observe that our lower and upper bounds
match up to constant factors when $F$ is drawn at random from $\{-1, 1\}^{d \times n}$. As it turns out the
polytope $K$ can be interpreted as the symmetric convex hull of the column vectors of $F$. When $F$ is a
random matrix, $K$ is a well-studied random polytope. Some recent results on random polytopes give
us suitable lower bounds on the volume and upper bounds on the average Euclidean norm. More
generally, our bounds are tight whenever $K$ is in isotropic position (as pointed out in Section 6).
This condition intuitively gives a relation between volume and average distance from the origin.
Our bounds are actually only tight up to a factor of $L_K$, the isotropic constant of $K$. A well-known
conjecture from convex geometry, known as the Hyperplane Conjecture or Slicing Conjecture, implies
that $L_K = O(1)$.

The problem is that when $F$ is not drawn at random, $K$ could be very far from isotropic. In
this case, the $K$-norm mechanism by itself might actually perform poorly. We thus give a recursive
variant of the $K$-norm mechanism in Section 7 which can handle non-isotropic bodies. Our approach
is based on analyzing the covariance matrix of $K$ in order to partition $K$ into parts on which our
earlier mechanism performs well. Assuming the Hyperplane conjecture, we derive bounds on the
error of our mechanism that are optimal to within polylogarithmic factors.

The costly step in both of our mechanisms is sampling uniformly from high-dimensional convex 
bodies such as $K = FB_1^n$. To implement the sampling step efficiently, we will use geometric random
walks. It can be shown that these random walks approach the uniform distribution over $K$ in
polynomial time. We will actually need convergence bounds in the relative $\ell_\infty$-metric, a metric strong
enough to entail guarantees about exact differential privacy rather than approximate differential 
privacy (to be introduced later). 

Some complications arise, since we need to repeat the privacy and optimality analysis of our 
mechanisms in the presence of approximation errors (such as an approximate covariance matrix and 
an approximate separation oracle for $K$). The details can be found in Section 8.

2 Preliminaries

Notation. We will write $B_p^d$ to denote the unit ball of the $p$-norm in $\mathbb{R}^d$. When $K \subseteq \mathbb{R}^d$ is
a centrally symmetric convex set, we write $\|\cdot\|_K$ for the (Minkowski) norm defined by $K$ (i.e.
$\|x\|_K = \inf\{r : x \in rK\}$). The $\ell_p$-norms are denoted by $\|\cdot\|_p$, but we use $\|\cdot\|$ as a shorthand for
the Euclidean norm $\|\cdot\|_2$. Given a function $F\colon \mathbb{R}^{d_1} \to \mathbb{R}^{d_2}$ and a set $K \subseteq \mathbb{R}^{d_1}$, $FK$ denotes the
set $\{F(x) : x \in K\}$.

2.1 Differential Privacy 

Definition 2.1. A mechanism $M$ is a family of probability measures $M = \{\mu_x : x \in \mathbb{R}^n\}$ where each
measure $\mu_x$ is defined on $\mathbb{R}^d$. A mechanism is called $\varepsilon$-differentially private, if for all $x, y \in \mathbb{R}^n$ such
that $\|x - y\|_1 \le 1$, we have $\sup_{S \subseteq \mathbb{R}^d} \frac{\mu_x(S)}{\mu_y(S)} \le \exp(\varepsilon)$, where the supremum runs over all measurable
subsets $S \subseteq \mathbb{R}^d$.

A common weakening of e-differential privacy is the following notion of approximate privacy. 

Definition 2.2. A mechanism is called $\delta$-approximate $\varepsilon$-differentially private, if for all $x, y \in \mathbb{R}^n$
with $\|x - y\|_1 \le 1$, we have $\mu_x(S) \le \exp(\varepsilon)\mu_y(S) + \delta$ for all measurable subsets $S \subseteq \mathbb{R}^d$.

The definition of privacy is transitive in the following sense. 

Fact 2.3. If $M$ is an $\varepsilon$-differentially private mechanism and $x, y \in \mathbb{R}^n$ satisfy $\|x - y\|_1 \le k$, then
for measurable $S \subseteq \mathbb{R}^d$ we have $\frac{\mu_x(S)}{\mu_y(S)} \le \exp(\varepsilon k)$.
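The fact follows by chaining Definition 2.1 along the segment from $y$ to $x$. A short sketch for integer $k \ge 1$ and sets $S$ of positive measure; the interpolation points $z_0, \dots, z_k$ are notation introduced here, and the argument uses that the mechanism is defined on all of $\mathbb{R}^n$:

```latex
% Sketch for integer k >= 1; z_0 = y, ..., z_k = x are auxiliary points (not from the paper).
z_i = y + \tfrac{i}{k}(x - y), \qquad
\|z_i - z_{i-1}\|_1 = \tfrac{1}{k}\|x - y\|_1 \le 1 \quad (1 \le i \le k),
\qquad
\frac{\mu_x(S)}{\mu_y(S)}
  = \prod_{i=1}^{k} \frac{\mu_{z_i}(S)}{\mu_{z_{i-1}}(S)}
  \le \exp(\varepsilon)^k = \exp(\varepsilon k).
```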

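Definition 2.4 (Error). Let $F\colon \mathbb{R}^n \to \mathbb{R}^d$ and $\ell\colon \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}^+$. We define the $\ell$-error of a
mechanism $M$ as $\mathrm{err}_\ell(M, F) = \sup_{x \in \mathbb{R}^n} \mathbb{E}_{a \sim \mu_x} \ell(a, Fx)$. Unless otherwise specified, we take $\ell$ to be
the Euclidean norm $\ell_2$.

Definition 2.5 (Sensitivity). We will consider mappings $F$ which possess the Lipschitz property,
$\sup_{x \in B_1^n} \|Fx\|_1 \le d$. In this case, we will say that $F$ has sensitivity $d$.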

Our goal is to show trade-offs between privacy and error. The following standard upper bound, 
usually called the Laplacian mechanism, is known. 

Theorem 2.6 ([DMNS06]). For any mapping $F\colon \mathbb{R}^n \to \mathbb{R}^d$ of sensitivity $d$ and any $\varepsilon > 0$, there
exists an $\varepsilon$-differentially private mechanism $M$ with $\mathrm{err}(M, F) = O(d\sqrt{d}/\varepsilon)$.
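A minimal sketch of the Laplacian mechanism behind Theorem 2.6, assuming $F$ is given as a list of $d$ rows with entries in $[-1, 1]$ (so that its sensitivity is at most $d$). The function name and the way Laplace noise is generated are choices made for this illustration.

```python
import random

def laplace_mechanism(F, x, eps, rng=random):
    """Answer Fx with independent Laplace noise of scale d/eps per coordinate."""
    d = len(F)
    scale = d / eps
    exact = [sum(f * v for f, v in zip(row, x)) for row in F]
    # A Laplace(scale) variable is the difference of two independent
    # exponential variables with mean `scale` (rate 1/scale).
    noise = [rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
             for _ in range(d)]
    return [e + z for e, z in zip(exact, noise)]

# Hypothetical usage: two queries on a histogram with two entries.
rng = random.Random(0)
noisy = laplace_mechanism([[1, 0], [0, 1]], [5.0, 7.0], eps=2.0, rng=rng)
```

Each coordinate has expected absolute error $d/\varepsilon$, which gives Euclidean error of order $d\sqrt{d}/\varepsilon$ over $d$ coordinates.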

When it comes to approximate privacy, the so-called Gaussian mechanism provides the following 
guarantee. 

Theorem 2.7 ([DKM+06]). Let $\varepsilon, \delta > 0$. Then, for any mapping $F\colon \mathbb{R}^n \to \mathbb{R}^d$ of sensitivity $d$ there
exists a $\delta$-approximate $\varepsilon$-differentially private mechanism $M$ with $\mathrm{err}(M, F) = O(d\sqrt{\log(1/\delta)}/\varepsilon)$.

2.2 Isotropic Position



Definition 2.8 (Isotropic Position). We say a convex body $K \subseteq \mathbb{R}^d$ is in isotropic position with
isotropic constant $L_K$ if for every unit vector $v \in \mathbb{R}^d$,

$$\frac{1}{\mathrm{Vol}(K)} \int_K |\langle z, v \rangle|^2 \, dz = L_K^2 \, \mathrm{Vol}(K)^{2/d}. \qquad (1)$$

Fact 2.9. For every convex body $K \subseteq \mathbb{R}^d$, there is a volume-preserving linear transformation $T$
such that $TK$ is in isotropic position.

For an arbitrary convex body $K$, its isotropic constant $L_K$ can then be defined to be $L_{TK}$ where
$T$ brings $K$ to isotropic position. It is known (e.g. [MP89]) that $T$ is unique up to an orthogonal
transformation and thus this is well-defined.

We refer the reader to the paper of Milman and Pajor [MP89], as well as the extensive survey of 
Giannopoulos [Gia03] for a proof of this fact and other facts regarding the isotropic constant. 



2.3 Gamma Distribution 

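The Gamma distribution with shape parameter $k > 0$ and scale $\theta > 0$, denoted $\mathrm{Gamma}(k, \theta)$, is
given by the probability density function

$$f(r; k, \theta) = r^{k-1} \frac{e^{-r/\theta}}{\Gamma(k)\theta^k}.$$

Here, $\Gamma(k) = \int_0^\infty e^{-r} r^{k-1} \, dr$ denotes the Gamma function. We will need an expression for the moments
of the Gamma distribution.

Fact 2.10. Let $r \sim \mathrm{Gamma}(k, \theta)$. Then,

$$\mathbb{E}[r^m] = \frac{\theta^m \, \Gamma(k + m)}{\Gamma(k)}. \qquad (2)$$

Proof.

$$\mathbb{E}[r^m] = \int_0^\infty r^{k+m-1} \frac{e^{-r/\theta}}{\Gamma(k)\theta^k} \, dr
= \frac{\Gamma(k + m)\theta^{k+m}}{\Gamma(k)\theta^k} \int_0^\infty r^{k+m-1} \frac{e^{-r/\theta}}{\Gamma(k + m)\theta^{k+m}} \, dr
= \frac{\theta^m \, \Gamma(k + m)}{\Gamma(k)},$$

where the last integral equals 1 because its integrand is the density of $\mathrm{Gamma}(k + m, \theta)$.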
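Fact 2.10 is easy to check numerically. The sketch below compares the closed form with a Monte Carlo estimate, using the shape/scale convention of Python's `random.gammavariate`; the function names, sample size and seed are arbitrary choices for this illustration.

```python
import math
import random

def gamma_moment(k, theta, m):
    """Closed form of Fact 2.10: E[r^m] = theta^m * Gamma(k + m) / Gamma(k)."""
    return theta ** m * math.gamma(k + m) / math.gamma(k)

def empirical_moment(k, theta, m, samples=100_000, seed=0):
    """Monte Carlo estimate of E[r^m] for r ~ Gamma(shape=k, scale=theta)."""
    rng = random.Random(seed)
    return sum(rng.gammavariate(k, theta) ** m for _ in range(samples)) / samples

exact_first = gamma_moment(3, 2.0, 1)    # k * theta
exact_second = gamma_moment(3, 2.0, 2)   # theta^2 * k * (k + 1)
estimate = empirical_moment(3, 2.0, 1)
```

For the noise radius $r \sim \mathrm{Gamma}(d+1, \varepsilon^{-1})$ used later, the first moment is $(d+1)/\varepsilon$.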



2.4 Linear Programming Characterization 

Suppose that the set of databases is given by some set $\mathcal{D}$, and let $\mathrm{dist}\colon \mathcal{D} \times \mathcal{D} \to \mathbb{R}_{\ge 0}$ be a distance
function on $\mathcal{D}$. A query $q$ is specified by an error function $\mathrm{err}\colon \mathcal{D} \times \mathcal{R} \to \mathbb{R}$. For example $\mathcal{D}$ could be
the Hamming cube $\{0, 1\}^N$ with dist being the Hamming distance. Given a query $F\colon \{0, 1\}^N \to \mathbb{R}^d$,
the error function could be $\mathrm{err}(x, a) = \|a - F(x)\|_2$ if we wish to compute $F(x)$ up to a small $\ell_2$
error.



A mechanism is specified by a distribution $\mu_x$ on $\mathcal{R}$ for every $x \in \mathcal{D}$. Assume for simplicity that
$\mathcal{D}$ and $\mathcal{R}$ are both finite. Thus a mechanism is fully defined by real numbers $\mu(x, a)$, where $\mu(x, a)$
is the probability that the mechanism outputs answer $a \in \mathcal{R}$ on database $x \in \mathcal{D}$. The constraints
on $\mu$ for an $\varepsilon$-differentially private mechanism are given by

$$\sum_{a \in \mathcal{R}} \mu(x, a) = 1 \qquad \forall x \in \mathcal{D},$$
$$\mu(x, a) \ge 0 \qquad \forall x \in \mathcal{D},\ a \in \mathcal{R},$$
$$\mu(x, a) \le \exp(\varepsilon \, \mathrm{dist}(x, x')) \, \mu(x', a) \qquad \forall x, x' \in \mathcal{D},\ a \in \mathcal{R}.$$

The expected error (under any given prior over databases) is then a linear function of the
variables $\mu(x, a)$ and can be optimized. Similarly, the worst case (over databases) expected error
can be minimized, and we will concentrate on this measure for the rest of the paper. However these
linear programs can be prohibitive in size. Moreover, it is not a priori clear how one can use this
formulation to understand the asymptotic behavior of the error of the optimum mechanism.

Our work leads to a constant approximation to the optimum of this linear program when $F$ is
random in $\{-1, +1\}^{d \times n}$ and an $O(\log^{3/2} d)$-approximation otherwise.

3 Lower bounds via volume estimates

In this section we show that lower bounds on the volume of the convex body $FB_1^n \subseteq \mathbb{R}^d$ give rise to
lower bounds on the error that any private mechanism must have with respect to $F$.

Definition 3.1. A set of points $Y \subseteq \mathbb{R}^d$ is called an $r$-packing if $\|y - y'\|_2 \ge r$ for any $y, y' \in Y$, $y \ne y'$.

Fact 3.2. Let $K \subseteq \mathbb{R}^d$ such that $R = \mathrm{Vol}(K)^{1/d}$. Then, $K$ contains an $\Omega(R\sqrt{d})$-packing of size at
least $\exp(d)$.

Proof. Since $\mathrm{Vol}(B_2^d)^{1/d} \sim 1/\sqrt{d}$, the body $K$ has the volume of a ball of radius $r \in \Omega(R\sqrt{d})$. Any
maximal $(r/4)$-packing then has the desired property: the balls of radius $r/4$ around its points cover $K$,
so it contains at least $4^d \ge \exp(d)$ points. ■

Theorem 3.3. Let $\varepsilon > 0$ and suppose $F\colon \mathbb{R}^n \to \mathbb{R}^d$ is a linear map and let $K = FB_1^n$. Then, every
$\varepsilon$-differentially private mechanism $M$ must have $\mathrm{err}(M, F) \ge \Omega(\varepsilon^{-1} d\sqrt{d} \cdot \mathrm{Vol}(K)^{1/d})$.

Proof. Let $\lambda \ge 1$ be some scalar and put $R = \mathrm{Vol}(K)^{1/d}$. By Fact 3.2 and our assumption,
$\lambda K = \lambda FB_1^n$ contains an $\Omega(\lambda R\sqrt{d})$-packing $Y$ of size at least $\exp(d)$. Let $X \subseteq \mathbb{R}^n$ be a set of
arbitrarily chosen preimages of $y \in Y$ so that $|X| = |Y|$ and $FX = Y$. By linearity, $\lambda FB_1^n = F(\lambda B_1^n)$
and hence we may assume that every $x \in X$ satisfies $\|x\|_1 \le \lambda$.

We will now assume that $M = \{\mu_x : x \in \mathbb{R}^n\}$ is an $\varepsilon$-differentially private mechanism with error
$c d\sqrt{d}R/\varepsilon$ and lead this to a contradiction for small enough $c > 0$. For this we set $\lambda = d/2\varepsilon$. By the
assumption on the error, Markov's inequality implies that for all $x \in X$, we have $\mu_x(B_x) \ge \frac{1}{2}$, where
$B_x$ is a ball of radius $2cd\sqrt{d}R/\varepsilon = 4c\lambda R\sqrt{d}$ centered at $Fx$. Since $Y = FX$ is an $\Omega(\lambda R\sqrt{d})$-packing,
the balls $\{B_x : x \in X\}$ are disjoint for small enough constant $c > 0$.

Since $\|x\|_1 \le \lambda$, it follows from $\varepsilon$-differential privacy with Fact 2.3 that

$$\mu_0(B_x) \ge \exp(-\varepsilon\lambda)\,\mu_x(B_x) \ge \tfrac{1}{2}\exp(-d/2).$$

Since the balls $B_x$ are pairwise disjoint,

$$1 \ge \mu_0\Big(\bigcup_{x \in X} B_x\Big) = \sum_{x \in X} \mu_0(B_x) \ge \exp(d) \cdot \tfrac{1}{2}\exp(-d/2) > 1 \qquad (3)$$

for $d \ge 2$. We have thus obtained a contradiction. ■

We denote by $\mathrm{VolLB}(F, \varepsilon)$ the lower bound resulting from the above theorem. In other words

$$\mathrm{VolLB}(F, \varepsilon) = \varepsilon^{-1} d\sqrt{d} \cdot \mathrm{Vol}(FB_1^n)^{1/d}.$$

Thus any $\varepsilon$-differentially private mechanism must add noise $\Omega(\mathrm{VolLB}(F, \varepsilon))$. We will later need
the following modification of the previous argument which gives a lower bound in the case where
$K$ is close to a lower dimensional subspace and hence the volume inside this subspace may give a
stronger lower bound.
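As a small illustration of the quantity VolLB, the sketch below evaluates it for two bodies whose volume is known in closed form: the cube $[-1,1]^d$ (volume $2^d$) and the cross-polytope $B_1^d$ (volume $2^d/d!$, which is $K$ when $F$ is the identity and $d = n$). The function names are hypothetical, and since the theorem only holds up to a constant factor the values are indicative only.

```python
import math

def vol_lb(d, eps, log_volume):
    """VolLB = eps^{-1} * d * sqrt(d) * Vol(K)^{1/d}, given log Vol(K).
    Working with the logarithm avoids overflow for large d."""
    return d * math.sqrt(d) * math.exp(log_volume / d) / eps

def log_vol_cube(d):
    """log Vol([-1, 1]^d) = d * log 2."""
    return d * math.log(2.0)

def log_vol_cross_polytope(d):
    """log Vol(B_1^d) = d * log 2 - log(d!)."""
    return d * math.log(2.0) - math.lgamma(d + 1)

cube_bound = vol_lb(4, 1.0, log_vol_cube(4))
cross_bound = vol_lb(2, 1.0, log_vol_cross_polytope(2))
```

For the cube the bound grows like $d\sqrt{d}/\varepsilon$, while for the cross-polytope $\mathrm{Vol}(K)^{1/d}$ decays roughly like $1/d$ and the bound is correspondingly weaker.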

Corollary 3.4. Let $\varepsilon > 0$ and suppose $F\colon \mathbb{R}^n \to \mathbb{R}^d$ is a linear map and let $K = FB_1^n$. Furthermore,
let $P$ denote the orthogonal projection operator onto a $k$-dimensional subspace of $\mathbb{R}^d$ for some $1 \le k \le d$.
Then, every $\varepsilon$-differentially private mechanism $M$ must have

$$\mathrm{err}(M, F) \ge \Omega(\varepsilon^{-1} k\sqrt{k} \cdot \mathrm{Vol}_k(PK)^{1/k}). \qquad (4)$$

Proof. Note that a differentially private answer $a$ to $F$ can be projected down to a (differentially
private) answer $Pa$ to $PF$ and $P$ is a norm 1 operator. ■

We will denote by $\mathrm{GVolLB}(F, \varepsilon)$ the best lower bound obtainable in this manner, i.e.,

$$\mathrm{GVolLB}(F, \varepsilon) = \sup_{k, P} \ \varepsilon^{-1} k\sqrt{k} \cdot \mathrm{Vol}_k(PFB_1^n)^{1/k}$$

where the supremum is taken over all $k$ and all $k$-dimensional orthogonal projections $P$.



Lower bounds in the Hamming metric. Our lower bound used the fact that the mechanism is
defined on all vectors $x \in \mathbb{R}^n$. In Appendix B, we show how the lower bound can be extended when
restricting the domain of the mechanism to integer vectors $x \in [N]^n$, where distance is measured in
the Hamming metric.

3.1 Lower bounds for small number of queries 

As shown previously, the task of proving lower bounds on the error of private mechanisms reduces
to analyzing the volume of $FB_1^n$. When $d \le \log n$ this is a straightforward task.

Fact 3.5. Let $d \le \log n$. Then, for all matrices $F \in [-1, 1]^{d \times n}$, $\mathrm{Vol}(FB_1^n)^{1/d} \le O(1)$. Furthermore,
there is an explicit matrix $F$ such that $FB_1^n$ has maximum volume.

Proof. Clearly, $FB_1^n$ is always contained in $B_\infty^d$ and $\mathrm{Vol}(B_\infty^d)^{1/d} = 2$. On the other hand, since
$n \ge 2^d$, we may take $F$ to contain all points of the hypercube $H = \{\pm 1\}^d$ as its columns. In this
case, $FB_1^n \supseteq B_\infty^d$. ■

This lower bound shows that the standard upper bound from Theorem 2.6 is, in fact, optimal
when $d \le \log(n)$.

KM($F$, $d$, $\varepsilon$):

1. Sample $z$ uniformly at random from $K = FB_1^n$ and sample $r \sim \mathrm{Gamma}(d + 1, \varepsilon^{-1})$.

2. Output $Fx + rz$.

Figure 2: Description of the $d$-dimensional $K$-norm mechanism.

4 The $K$-norm mechanism

In this section we describe a new differentially private mechanism, which we call the $K$-norm
mechanism.

Definition 4.1 ($K$-norm mechanism). Given a linear map $F\colon \mathbb{R}^n \to \mathbb{R}^d$ and $\varepsilon > 0$, we let $K = FB_1^n$
and define the mechanism $\mathrm{KM}(F, d, \varepsilon) = \{\mu_x : x \in \mathbb{R}^n\}$ so that each measure $\mu_x$ is given by the
probability density function

$$f(a) = Z^{-1} \exp(-\varepsilon\|Fx - a\|_K) \qquad (5)$$

defined over $\mathbb{R}^d$. Here $Z$ denotes the normalization constant

$$Z = \int_{\mathbb{R}^d} \exp(-\varepsilon\|Fx - a\|_K) \, da = \Gamma(d + 1)\,\mathrm{Vol}(\varepsilon^{-1}K).$$

A more concrete view of the mechanism is provided by Figure 2 and justified in the next remark.

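Remark 4.2. We can sample from the distribution $\mu_x$ as follows:

1. Sample $r$ from the Gamma distribution with parameter $d + 1$ and scale $\varepsilon^{-1}$, denoted
$\mathrm{Gamma}(d + 1, \varepsilon^{-1})$. That is, $r$ has density proportional to $r^d e^{-\varepsilon r}$.

2. Sample $a$ uniformly from $Fx + rK$.

Indeed, if $\|a - Fx\|_K = R$, then the distribution of $a$ as above follows the probability density
function

$$g(a) = \int_R^\infty \frac{\varepsilon^{d+1} t^d e^{-\varepsilon t}}{\Gamma(d + 1)} \cdot \frac{1}{\mathrm{Vol}(tK)} \, dt
= \frac{\varepsilon^{d+1}}{\Gamma(d + 1)\mathrm{Vol}(K)} \int_R^\infty e^{-\varepsilon t} \, dt
= \frac{e^{-\varepsilon R}}{\Gamma(d + 1)\mathrm{Vol}(\varepsilon^{-1}K)}, \qquad (6)$$

where we used $\mathrm{Vol}(tK) = t^d\,\mathrm{Vol}(K)$ and $\mathrm{Vol}(\varepsilon^{-1}K) = \varepsilon^{-d}\,\mathrm{Vol}(K)$. This
is in agreement with (5). That is, $g(a) = f(a)$.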
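The two sampling steps of the remark can be sketched in the special case where $K$ is the cube $[-1,1]^d$, so that uniform sampling from $K$ is trivial. This choice of $K$ is an assumption made purely for illustration; for a general polytope $K = FB_1^n$ the uniform sampling step is the hard part. The function names are hypothetical.

```python
import random

def k_norm_noise_cube(d, eps, rng=random):
    """Noise r*z of the K-norm mechanism for the special case K = [-1, 1]^d."""
    r = rng.gammavariate(d + 1, 1.0 / eps)           # r ~ Gamma(d + 1, 1/eps)
    z = [rng.uniform(-1.0, 1.0) for _ in range(d)]   # z uniform in the cube K
    return [r * zi for zi in z]

def k_norm_mechanism_cube(Fx, eps, rng=random):
    """Given the exact answer Fx (a list of d numbers), output Fx + r*z."""
    noise = k_norm_noise_cube(len(Fx), eps, rng)
    return [a + b for a, b in zip(Fx, noise)]

# Hypothetical usage with d = 3 exact answers and eps = 1.
rng = random.Random(1)
noisy = k_norm_mechanism_cube([10.0, 4.0, 7.0], 1.0, rng)
```

For the cube, $\|\cdot\|_K$ is the $\ell_\infty$-norm, and under the density (5) the $K$-norm of the noise has mean $d/\varepsilon$, which gives a quick sanity check on samples.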

The next theorem shows that the $K$-norm mechanism is indeed differentially private. Moreover,
we can express its error in terms of the expected distance from the origin of a random point in $K$.

Theorem 4.3. Let $\varepsilon > 0$. Suppose $F\colon \mathbb{R}^n \to \mathbb{R}^d$ is a linear map and put $K = FB_1^n$. Then, the
mechanism $\mathrm{KM}(F, d, \varepsilon)$ is $\varepsilon$-differentially private, and for every $p > 0$ achieves the error bound
$\mathbb{E}_{a \sim \mu_x}\|Fx - a\|^p \le \frac{\Gamma(d + 1 + p)}{\varepsilon^p\,\Gamma(d + 1)} \mathbb{E}_{z \in K}\|z\|^p$. In particular, the $\ell_2$-error is at most $\frac{d + 1}{\varepsilon} \mathbb{E}_{z \in K}\|z\|_2$.

Proof. To argue the error bound, we will follow Remark 4.2. Let $D = \mathrm{Gamma}(d + 1, \varepsilon^{-1})$. For
all $x \in \mathbb{R}^n$,

$$\mathbb{E}_{a \sim \mu_x}\|Fx - a\|^p
= \mathbb{E}_{r \sim D}\, \mathbb{E}_{a \in rK}\|a\|^p
= \mathbb{E}_{r \sim D}\, r^p \cdot \mathbb{E}_{z \in K}\|z\|^p
= \frac{\Gamma(d + 1 + p)}{\varepsilon^p\,\Gamma(d + 1)} \mathbb{E}_{z \in K}\|z\|^p \qquad \text{(by Fact 2.10)}.$$

Privacy follows from the fact that the mechanism is a special case of the exponential mechanism
[MT07]. For completeness, we repeat the argument.

Suppose that $\|x\|_1 \le 1$. It suffices to show that for all $a \in \mathbb{R}^d$ the densities of $\mu_0$ and $\mu_x$ are
within a multiplicative factor $\exp(\varepsilon)$, i.e.,

$$\frac{Z^{-1} e^{-\varepsilon\|a\|_K}}{Z^{-1} e^{-\varepsilon\|Fx - a\|_K}}
= e^{\varepsilon(\|Fx - a\|_K - \|a\|_K)} \le e^{\varepsilon\|Fx\|_K} \le e^{\varepsilon},$$

where in the first inequality we used the triangle inequality for $\|\cdot\|_K$. In the second step we used
that $x \in B_1^n$ and hence $Fx \in FB_1^n = K$ which means $\|Fx\|_K \le 1$.

Hence, the mechanism satisfies $\varepsilon$-differential privacy. ■



5 Matching bounds for random queries 

In this section, we will show that our upper bound matches our lower bound when $F$ is a random query. A key observation is that $FB_1^n$ is the symmetric convex hull of $n$ (random) points $\{v_1, \dots, v_n\} \subseteq \mathbb{R}^d$, i.e., the convex hull of $\{\pm v_1, \dots, \pm v_n\}$, where $v_i \in \mathbb{R}^d$ is the $i$th column of $F$. The symmetric convex hull of random points has been studied extensively in the theory of random polytopes. A recent result of Litvak, Pajor, Rudelson and Tomczak-Jaegermann [LPRN05] gives the following lower bound on the volume of the convex hull.

Theorem 5.1 ([LPRN05]). Let $2d \le n \le 2^d$ and let $F$ denote a random $d \times n$ Bernoulli matrix. Then,
\[ \mathrm{Vol}(FB_1^n)^{1/d} \ge \Omega(1)\sqrt{\log(n/d)/d}, \tag{7} \]
with probability $1 - \exp(-\Omega(d^{\beta} n^{1-\beta}))$ for any $\beta \in (0, \tfrac12)$. Furthermore, there is an explicit construction of $n$ points in $\{-1, 1\}^d$ whose convex hull achieves the same volume.

We are mostly interested in the range where $n \gg d \log d$, in which case the theorem was already proved by Giannopoulos and Hartzoulaki [GH02] (up to a weaker bound in the probability and without the explicit construction).

The bound in (7) is tight up to constant factors. A well known result [BF88] shows that the volume of the convex hull of any $n$ points on the sphere in $\mathbb{R}^d$ of radius $\sqrt{d}$ is bounded by
\[ \mathrm{Vol}(K)^{1/d} \le O(1)\sqrt{\log(n/d)/d}. \tag{8} \]

Notice that in our case $K = FB_1^n \subseteq B_\infty^d \subseteq \sqrt{d}\,B_2^d$ and in fact the vertices of $K$ are points on the $(d-1)$-dimensional sphere of radius $\sqrt{d}$. However, equation (7) states that the normalized volume of the random polytope $K$ will be proportional to the volume of the Euclidean ball of radius $\sqrt{\log(n/d)}$ rather than $\sqrt{d}$. When $d \gg \log n$, this means that the volume of $K$ will be tiny compared to the volume of the infinity ball $B_\infty^d$. By combining the volume lower bound with Theorem 3.3, we get the following lower bound on the error of private mechanisms.

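Theorem 5.2. Let $\varepsilon > 0$ and $0 < d \le n/2$. Then, for almost all matrices $F \in \{-1, 1\}^{d\times n}$, every $\varepsilon$-differentially private mechanism $M$ must have
\[ \mathrm{err}(M, F) \ge \Omega(d/\varepsilon) \cdot \min\left\{\sqrt{d}, \sqrt{\log(n/d)}\right\}. \tag{9} \]
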
5.1 A separation result. 

We use this paragraph to point out that our lower bound immediately implies a separation between 
approximate and exact differential privacy. 

Theorem 2.7 gives a mechanism providing $\delta$-approximate $\varepsilon$-differential privacy with error $o(\varepsilon^{-1} d\sqrt{\log(n/d)})$ as long as $\delta \ge 1/n^{o(1)}$. Our lower bound in Theorem 5.2 on the other hand states that the error of any $\varepsilon$-differentially private mechanism must be $\Omega(\varepsilon^{-1} d\sqrt{\log(n/d)})$ (assuming $d \gg \log(n)$). We get the strongest separation when $d \le \log(n)$ and $\delta$ is constant. In this case, our lower bound is a factor $\sqrt{d}$ larger than the upper bound for approximate differential privacy.



5.2 Upper bound on average Euclidean norm 

Klartag and Kozma [KK09] recently gave a bound on the quantity $\mathbb{E}_{z\in K}\|z\|_2^2$ when $K = FB_1^n$ for random $F$.

Theorem 5.3 ([KK09]). Let $F$ be a random $d \times n$ Bernoulli matrix and put $K = FB_1^n$. Then, there is a constant $C > 0$ so that with probability greater than $1 - Ce^{-O(n)}$,
\[ \frac{1}{\mathrm{Vol}(K)} \int_{z\in K} \|z\|_2^2 \, dz \le C\log(n/d). \tag{10} \]

An application of Jensen's inequality thus gives us the following corollary. 

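Corollary 5.4. Let $\varepsilon > 0$ and $0 < d \le n/2$. Then, for almost all matrices $F \in \{-1, 1\}^{d\times n}$, the mechanism $\mathrm{KM}(F, d, \varepsilon)$ is $\varepsilon$-differentially private with error at most
\[ O(d/\varepsilon) \cdot \min\left\{\sqrt{d}, \sqrt{\log(n/d)}\right\}. \tag{11} \]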
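To see which term of the minimum is active for given parameters, the matching rate in (9) and (11) can be evaluated directly. The sketch below is only indicative: it sets the hidden constant to 1 and uses the natural logarithm, both of which are our assumptions, and the function name `error_rate` is ours.

```python
import math

def error_rate(d, n, eps):
    """(d/eps) * min(sqrt(d), sqrt(log(n/d))), with the hidden constant set to 1."""
    if not 0 < d <= n / 2:
        raise ValueError("expects 0 < d <= n/2")
    return (d / eps) * min(math.sqrt(d), math.sqrt(math.log(n / d)))

few_queries = error_rate(4, 2**20, 1.0)      # sqrt(d) is the smaller term here
many_queries = error_rate(1024, 4096, 1.0)   # sqrt(log(n/d)) is the smaller term here
```

For few queries relative to $\log n$ the $\sqrt{d}$ term governs the rate; once $d$ exceeds $\log(n/d)$ the logarithmic term takes over.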



6 Approximately isotropic bodies 

The following definition is a relaxation of nearly isotropic position used in the literature (e.g., [KLS97]).

Definition 6.1 (Approximately Isotropic Position). We say a convex body $K \subseteq \mathbb{R}^d$ is in $c$-approximately isotropic position if for every unit vector $v \in \mathbb{R}^d$,
\[ \frac{1}{\mathrm{Vol}(K)} \int_K |\langle z, v\rangle|^2 \, dz \le c^2 L_K^2\, \mathrm{Vol}(K)^{2/d}. \tag{12} \]
The results of Klartag and Kozma [KK09] referred to in the previous section show that the symmetric convex hull of $n$ random points from the $d$-dimensional hypercube is in $O(1)$-approximately isotropic position and has $L_K = O(1)$. More generally, the $K$-norm mechanism can be shown to be approximately optimal whenever $K$ is nearly isotropic.

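Theorem 6.2 (Theorem 1.2 restated). Let $\varepsilon > 0$. Suppose $F\colon \mathbb{R}^n \to \mathbb{R}^d$ is a linear map such that $K = FB_1^n$ is in $c$-approximately isotropic position. Then, the $K$-norm mechanism is $\varepsilon$-differentially private and has error at most $O(cL_K) \cdot \mathrm{VolLB}(F, \varepsilon)$.

Proof. By Theorem 4.3, the $K$-norm mechanism is $\varepsilon$-differentially private and has error at most $\frac{d+1}{\varepsilon}\,\mathbb{E}_{z\in K}\|z\|_2$. By the definition of the approximately isotropic position, we have $\mathbb{E}_{z\in K}\|z\|_2^2 \le d \cdot c^2 L_K^2\, \mathrm{Vol}(K)^{2/d}$. By Jensen's inequality,
\[ \frac{d+1}{\varepsilon}\, \mathbb{E}_{z\in K}\|z\|_2 \le \frac{d+1}{\varepsilon} \sqrt{\mathbb{E}_{z\in K}\|z\|_2^2} \le O\!\left(\varepsilon^{-1} d\sqrt{d} \cdot \mathrm{Vol}(K)^{1/d}\, cL_K\right). \]
Plugging in the definition of VolLB proves the result. ■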

We can see that the previous upper bound is tight up to a factor of $cL_K$. Estimating $L_K$ for general convex bodies is a well-known open problem in convex geometry. The best known upper bound for a general convex body $K \subseteq \mathbb{R}^d$ is $L_K \le O(d^{1/4})$ due to Klartag [Kla06], improving over the estimate $L_K \le O(d^{1/4}\log d)$ of Bourgain from '91. The conjecture is that $L_K = O(1)$.

Conjecture 6.3 (Hyperplane Conjecture). There exists $C > 0$ such that for every $d$ and every convex set $K \subseteq \mathbb{R}^d$, $L_K < C$.

Assuming this conjecture we get matching bounds for approximately isotropic convex bodies. 

Theorem 6.4. Let $\varepsilon > 0$. Assuming the hyperplane conjecture, for every $F \in [-1,1]^{d\times n}$ such that $K = FB_1^n$ is $c$-approximately isotropic, the $K$-norm mechanism $\mathrm{KM}(F, d, \varepsilon)$ is $\varepsilon$-differentially private with error at most
\[ O(c) \cdot \mathrm{VolLB}(F, \varepsilon) \le O(cd/\varepsilon) \cdot \min\left\{\sqrt{d}, \sqrt{\log(n/d)}\right\}. \tag{13} \]

7 Non-isotropic bodies 

While the mechanism of the previous sections is near-optimal for near- isotropic queries, it can be far 
from optimal if K is far from isotropic. For example, suppose the matrix F has random entries from 
{+1, —1} in the first row, and (say) from {^-, — ^} in the remaining rows. While the Laplacian 
mechanism will add $O(1/\varepsilon)$ noise to the first co-ordinate of $Fx$, the $K$-norm mechanism will add noise $O(d/\varepsilon)$ to the first co-ordinate. Moreover, the volume lower bound VolLB is at most $O(\varepsilon^{-1}\sqrt{d})$. Rotating $F$ by a random rotation gives, w.h.p., a query for which the Laplacian mechanism adds $\ell_2$ error $O(d/\varepsilon)$. For such a body, the Laplacian and the $K$-norm mechanisms, as well as the VolLB, are far from optimal.

In this section, we will design a recursive mechanism that can handle such non-isotropic convex 
bodies. To this end, we will need to introduce a few more notions from convex geometry. 

Suppose $K \subseteq \mathbb{R}^d$ is a centered convex body, i.e. $\int_K x\, dx = 0$. The covariance matrix of $K$, denoted $M_K$, is the $d \times d$ matrix with entry $ij$ equal to $M_{ij} = \frac{1}{\mathrm{Vol}(K)} \int_K x_i x_j \, dx$. That is, $M_K$ is the covariance matrix of the uniform distribution over $K$.
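The definition can be made concrete by estimating $M_K$ from uniform samples. The sketch below uses an axis-aligned box as a stand-in for $K$, chosen by us because it can be sampled exactly and its covariance matrix is known in closed form, $\mathrm{diag}(s_i^2/3)$; the helper names are hypothetical.

```python
import random

def sample_covariance(samples):
    """Empirical second-moment matrix (1/N) * sum of x x^T for centred samples."""
    n = len(samples)
    d = len(samples[0])
    m = [[0.0] * d for _ in range(d)]
    for x in samples:
        for i in range(d):
            for j in range(d):
                m[i][j] += x[i] * x[j] / n
    return m

def sample_box(half_widths, rng):
    """Uniform point of the centred box prod_i [-s_i, s_i] (a stand-in for K)."""
    return [rng.uniform(-s, s) for s in half_widths]

rng = random.Random(0)
half_widths = [3.0, 1.0]
pts = [sample_box(half_widths, rng) for _ in range(20000)]
M = sample_covariance(pts)   # should be close to diag(s_i^2 / 3) = diag(3, 1/3)
```

The strongly unequal diagonal entries are exactly the kind of non-isotropy that the rest of this section is designed to handle.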
NIM($F$, $d$, $\varepsilon$):

1. Let $K = FB_1^n$. Let $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_d$ denote the eigenvalues of the covariance matrix $M_K$. Pick a corresponding orthonormal eigenbasis $u_1, \dots, u_d$.

2. Let $d' = \lfloor d/2 \rfloor$ and let $U = \mathrm{span}\{u_1, \dots, u_{d'}\}$ and $V = \mathrm{span}\{u_{d'+1}, \dots, u_d\}$.

3. Sample $a \sim \mathrm{KM}(F, d, \varepsilon)$.

4. If $d = 1$, output $P_V a$. Otherwise, output $\mathrm{NIM}(P_U F, d', \varepsilon) + P_V a$.

Figure 3: Mechanism for non-isotropic bodies

7.1 A recursive mechanism 

Having defined the covariance matrix, we can describe a recursive mechanism for the case when $K$ is not in isotropic position. The idea of the mechanism is to act differently on different eigenspaces of the covariance matrix. Specifically, the mechanism will use a lower-dimensional version of $\mathrm{KM}(F, d', \varepsilon)$ on subspaces corresponding to few large eigenvalues.

Our mechanism, called $\mathrm{NIM}(F, d, \varepsilon)$, is given a linear mapping $F\colon \mathbb{R}^n \to \mathbb{R}^d$, and parameters $d \in \mathbb{N}$, $\varepsilon > 0$. The mechanism proceeds recursively by partitioning the convex body $K$ into two parts defined by the middle eigenvalue of $M_K$. On one part it will act according to the $K$-norm mechanism. On the other part, it will descend recursively. The mechanism is described in Figure 3.
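The recursion can be sketched in a toy setting. The code below assumes that $K$ is an axis-aligned box $\prod_i [-s_i, s_i]$, an assumption of ours made so that the eigenbasis of $M_K$ is the standard basis and a uniform point of $K$ is trivial to draw. It returns only the noise vector that would be added to $Fx$; the names `km_noise_box` and `nim_noise_box` are hypothetical.

```python
import random

def km_noise_box(half_widths, eps, rng):
    """K-norm noise for the box K = prod_i [-s_i, s_i]: r * (uniform point of K)."""
    d = len(half_widths)
    r = rng.gammavariate(d + 1, 1.0 / eps)
    return [r * rng.uniform(-s, s) for s in half_widths]

def nim_noise_box(half_widths, eps, rng):
    """Noise of the recursive mechanism when K is an axis-aligned box.

    For a box the covariance matrix is diag(s_i^2 / 3), so U and V are spanned
    by the coordinates with large and small s_i respectively.
    """
    d = len(half_widths)
    noise = [0.0] * d
    # Coordinates sorted by decreasing eigenvalue s_i^2 / 3.
    active = sorted(range(d), key=lambda i: -half_widths[i])
    while active:
        k = len(active)
        d_prime = k // 2                       # d' = floor(d / 2)
        a = km_noise_box([half_widths[i] for i in active], eps, rng)
        # Keep only the V-part of a: the coordinates with the small eigenvalues.
        for pos in range(d_prime, k):
            noise[active[pos]] = a[pos]
        active = active[:d_prime]              # recurse on U with the body P_U K
    return noise

rng = random.Random(0)
noise = nim_noise_box([8.0, 4.0, 2.0, 1.0], eps=1.0, rng=rng)
```

In this toy case the widest direction ends up with noise from a one-dimensional mechanism (Gamma shape 2) instead of the full four-dimensional one (Gamma shape 5), which is the effect the recursion is after. Each level is run with the same $\varepsilon$, so the overall privacy parameter grows with the number of levels.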

Remark 7.1. The image of $P_U F$ above is a $d'$-dimensional subspace of $\mathbb{R}^d$. We assume that in the recursive call $\mathrm{NIM}(P_U F, d', \varepsilon)$, the $K$-norm mechanism is applied to a basis of this subspace. However, formally the output is a $d$-dimensional vector.

To analyze our mechanism, first observe that the recursive calls terminate after at most $\log d$ steps. For each recursive step $m \in \{0, \dots, \log d\}$, let $a_m$ denote the distribution over the output of the $K_m$-norm mechanism in step 3. Here, $K_m$ denotes the $d_m$-dimensional body given in step $m$.

Lemma 7.2. The mechanism $\mathrm{NIM}(F, d, \varepsilon)$ satisfies $(\varepsilon \log d)$-differential privacy.

Proof. We claim that for every step $m \in \{0, \dots, \log d\}$, the distribution over $a_m$ is $\varepsilon$-differentially private. Notice that this claim implies the lemma, since the joint distribution of $a_0, a_1, \dots, a_m$ is $\varepsilon\log(d)$-differentially private. In particular, this is true for the final output of the mechanism as it is a function of $a_0, \dots, a_m$.

To see why the claim is true, observe that each $K_m$ is the $d_m$-dimensional image of the $\ell_1$-ball under a linear mapping. Hence, the $K_m$-norm mechanism guarantees $\varepsilon$-differential privacy by Theorem 4.3. ■

The error analysis of our mechanism requires more work. In particular, we need to understand how the volume of $P_U K$ compares to the norm of $P_V a$. As a first step we will analyze the volume of $P_U K$.
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7.2 Volume in eigenspaces of the covariance matrix 

Our goal in this section is to express the volume of K in eigenspaces of the covariance matrix 
in terms of the eigenvalues of the covariance matrix. This will be needed in the analysis of our 
mechanism for non-isotropic bodies. 

We start with a formula for the volume of central sections of isotropic bodies. This result can 
be found in [MP89]. 

Proposition 7.3. Let $K \subseteq \mathbb{R}^d$ be an isotropic body of unit volume. Let $E$ denote a $k$-dimensional subspace for $1 \le k \le d$. Then,
\[ \mathrm{Vol}_k(E \cap K)^{1/(d-k)} = \Theta\!\left(\frac{L_{B_K}}{L_K}\right). \]
Here, $B_K$ is an explicitly defined isotropic convex body.

From here on, for an isotropic body $K$, let $\alpha_K = \Omega(L_{B_K}/L_K)$ be a lower bound on $\mathrm{Vol}_k(E \cap K)^{1/(d-k)}$ implied by the above proposition. For a non-isotropic $K$, let $\alpha_K$ be $\alpha_{TK}$ when $T$ is the map that brings $K$ into isotropic position. Notice that if the Hyperplane Conjecture is true, then $\alpha_K = \Omega(1)$. Moreover, $\alpha_K$ is $\Omega(d^{-1/4})$ due to the results of [Kla06].

Corollary 7.4. Let $K \subseteq \mathbb{R}^d$ be an isotropic body with $\mathrm{Vol}(K) = 1$. Let $E$ denote a $k$-dimensional subspace for $1 \le k \le d$ and let $P$ denote an orthogonal projection operator onto the subspace $E$. Then,
\[ \mathrm{Vol}_k(PK)^{1/(d-k)} \ge \alpha_K. \]

Proof. Observe that $PK$ contains $E \cap K$ since $P$ is the identity on $E$. ■

We cannot immediately use these results since they only apply to isotropic bodies and we 
are specifically dealing with non-isotropic bodies. The trick is to apply the previous results after 
transforming $K$ into an isotropic body while keeping track of how much this transformation changed the volume.

As a first step, the following lemma relates the volume of projections of an arbitrary convex 
body K to the volume of projections of TK for some linear mapping T. 

Lemma 7.5. Let $K \subseteq \mathbb{R}^d$ be a symmetric convex body. Let $T$ be a linear map which has eigenvectors $u_1, \dots, u_d$ with eigenvalues $\lambda_1, \dots, \lambda_d$. Let $1 \le k \le d$ and suppose $E = \mathrm{span}\{u_1, u_2, \dots, u_k\}$. Denote by $P$ the projection operator onto the subspace $E$. Then,
\[ \mathrm{Vol}_k(PK) \ge \mathrm{Vol}_k(PTK) \prod_{i=1}^{k} \lambda_i^{-1}. \]

Proof. For simplicity, we assume that the eigenvectors of $T$ are the standard basis vectors $e_1, \dots, e_d$; this is easily achieved by applying a rotation to $K$. Now, it is easy to verify that $P = PT^{-1}T = SPT$ where $S = \mathrm{diag}(\lambda_1^{-1}, \lambda_2^{-1}, \dots, \lambda_k^{-1}, 0, \dots, 0)$. Thus we can write
\[ \mathrm{Vol}_k(PK) = \det(S|_E)\, \mathrm{Vol}_k(PTK) = \frac{1}{\prod_{i=1}^k \lambda_i}\, \mathrm{Vol}_k(PTK). \]
■
Before we can finish our discussion, we will need the fact that the isotropic constant of $K$ can be expressed in terms of the determinant of $M_K$.

Fact 7.6 ([Gia03, MP89]). Let $K \subseteq \mathbb{R}^d$ be a convex body of unit volume. Then,
\[ L_K^2\, \mathrm{Vol}(K)^{2/d} = \det(M_K)^{1/d}. \tag{14} \]
Moreover, $K$ is in isotropic position iff $M_K = L_K^2\, \mathrm{Vol}(K)^{2/d}\, I$.

We conclude with the following Proposition 7.7.

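Proposition 7.7. Let $K \subseteq \mathbb{R}^d$ be a symmetric convex body. Let $M_K$ have eigenvectors $u_1, \dots, u_d$ with eigenvalues $\sigma_1, \dots, \sigma_d$. Let $1 \le k \le \lceil d/2 \rceil$ and suppose $E = \mathrm{span}\{u_1, u_2, \dots, u_k\}$. Denote by $P$ the projection operator onto the subspace $E$. Then,
\[ \mathrm{Vol}_k(PK)^{1/(d-k)} \ge \Omega(1)\, \alpha_K \Big(\prod_{i=1}^{k} \sigma_i^{1/2}\Big)^{1/(d-k)}, \tag{15} \]
where $\alpha_K$ is $\Omega(d^{-1/4})$. Moreover, assuming the Hyperplane conjecture, $\alpha_K = \Omega(1)$.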

Proof. Consider the linear mapping $T = M_K^{-1/2}$; this is well defined since $M_K$ is a positive symmetric matrix. It is easy to see that after applying $T$, we have $M_{TK} = I$. Hence, by Fact 7.6, $TK$ is in isotropic position and has volume $\mathrm{Vol}(TK)^{1/d} = 1/L_{TK} = 1/L_K$, since $\det(M_{TK}) = 1$. Scaling $TK$

lid — - — - — - 

by A = L^ hence results in \o\{XTK) = 1. Noting that AT has eigenvalues Xa l 2 , Xa 2 2 , . • • , Xa d 2 , 
we can apply Lemma 7.5 and get 

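\[ \mathrm{Vol}_k(PK) \ge \mathrm{Vol}_k(P\lambda TK) \prod_{i=1}^{k} \frac{\sqrt{\sigma_i}}{\lambda}. \]

Since $\lambda TK$ is in isotropic position and has unit volume, Corollary 7.4 implies that
\[ \mathrm{Vol}_k(P\lambda TK)^{1/(d-k)} \ge \alpha_K. \tag{16} \]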

Thus the required inequality holds with an additional A d ~ k term. By assumption on k, -rqr is at 
most 2. Moreover, A = LH d d}/ d ^ 2, so that this additional term is a constant. As discussed 

above, $\alpha_K$ is $\Omega(d^{-1/4})$ by [Kla06], and $\Omega(1)$ assuming the Hyperplane Conjecture 6.3. Hence the claim. ■

7.3 Arguing near optimality of our mechanism 

Our next lemma shows that the expected squared Euclidean error added by our algorithm in each 
step is bounded by the square of the optimum. We will first need the following fact. 

Fact 7.8. Let $K \subseteq \mathbb{R}^d$ be a centered convex body. Let $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_d$ denote the eigenvalues of $M_K$ with a corresponding orthonormal eigenbasis $u_1, \dots, u_d$. Then, for all $1 \le i \le d$,
\[ \sigma_i = \max_{\theta}\, \mathbb{E}_{x\in K} \langle \theta, x\rangle^2 \tag{17} \]
where the maximum runs over all $\theta \in S^{d-1}$ such that $\theta$ is orthogonal to $u_1, u_2, \dots, u_{i-1}$.
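Lemma 7.9. Let $a$ denote the random variable returned by the $K$-norm mechanism in step (3) in the above description of $\mathrm{NIM}(F, d, \varepsilon)$. Then,
\[ \mathrm{GVolLB}(F, \varepsilon)^2 \ge \Omega(\alpha_K^2)\, \mathbb{E}\|P_V a\|_2^2 . \]

Proof. For simplicity, we will assume that $d$ is even and hence $d - d' = d'$. The analysis of the $K$-norm mechanism (Theorem 4.3 with $p = 2$) shows that the random variable $a$ returned by the $K$-norm mechanism in step (3) satisfies
\[ \begin{aligned}
\mathbb{E}\|P_V a\|_2^2 &= \frac{\Gamma(d+3)}{\varepsilon^2\,\Gamma(d+1)}\, \mathbb{E}_{z\in K}\|P_V z\|_2^2 = \frac{(d+2)(d+1)}{\varepsilon^2} \sum_{i=d'+1}^{d} \mathbb{E}_{z\in K}\langle z, u_i\rangle^2 \\
&= O\!\left(\frac{d^2}{\varepsilon^2}\right) \sum_{i=d'+1}^{d} \sigma_i \qquad \text{(by Fact 7.8)} \\
&\le O\!\left(\frac{d^3}{\varepsilon^2}\right) \cdot \sigma_{d'+1}.
\end{aligned} \tag{18} \]

On the other hand, by the definition of GVolLB,
\[ \mathrm{GVolLB}(F, \varepsilon)^2 \ge \Omega\!\left(\frac{d^3}{\varepsilon^2}\right) \mathrm{Vol}_{d'}(P_U K)^{2/d'} \ge \Omega\!\left(\frac{d^3}{\varepsilon^2}\right) \alpha_K^2 \Big(\prod_{i=1}^{d'} \sigma_i\Big)^{1/d'} \ge \Omega\!\left(\frac{d^3}{\varepsilon^2}\right) \alpha_K^2\, \sigma_{d'}, \]
where the second inequality uses Proposition 7.7 with $k = d'$.

Since $\sigma_{d'} \ge \sigma_{d'+1}$, it follows that
\[ \mathrm{GVolLB}(F, \varepsilon)^2 \ge \Omega(\alpha_K^2)\, \mathbb{E}\|P_V a\|_2^2 . \tag{19} \]

The case of odd $d$ is similar except that we define $K'$ to be the projection onto the first $d' + 1$ eigenvectors. ■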

Lemma 7.10. Assume the hyperplane conjecture. Then, the $\ell_2$-error of the mechanism $\mathrm{NIM}(F, d, \varepsilon)$ satisfies
\[ \mathrm{err}(\mathrm{NIM}, F) \le O\!\left(\sqrt{\log(d)} \cdot \mathrm{GVolLB}(F, \varepsilon)\right). \]

Proof. We have to sum up the error over all recursive calls of the mechanism. To this end, let $P_{V_m} a_m$ denote the output of the $K$-norm mechanism $a_m$ in step $m$ projected to the corresponding subspace $V_m$. Also, let $a \in \mathbb{R}^d$ denote the final output of our mechanism. We then have,
\[ \begin{aligned}
\mathbb{E}\|a\|_2 &\le \sqrt{\mathbb{E}\|a\|_2^2} \qquad \text{(Jensen's inequality)} \\
&= \sqrt{\sum_{m=1}^{\log d} \mathbb{E}\|P_{V_m} a_m\|_2^2} \\
&\le \sqrt{\sum_{m=1}^{\log d} O(\alpha_{K_m}^{-2}) \cdot \mathrm{GVolLB}(F, \varepsilon)^2} \qquad \text{(by Lemma 7.9)} \\
&\le O\!\left(\sqrt{\log d}\right) \Big(\max_m \alpha_{K_m}^{-1}\Big)\, \mathrm{GVolLB}(F, \varepsilon).
\end{aligned} \]

Here we have used the fact that $\mathrm{GVolLB}(F, \varepsilon) \ge \mathrm{GVolLB}(P_U F, \varepsilon)$. Finally, the hyperplane conjecture implies $\max_m \alpha_{K_m}^{-1} = O(1)$. ■

Corollary 7.11. Let $\varepsilon > 0$. Suppose $F\colon \mathbb{R}^n \to \mathbb{R}^d$ is a linear map. Further, assume the hyperplane conjecture. Then, there is an $\varepsilon$-differentially private mechanism $M$ with error
\[ \mathrm{err}(M, F) \le O\!\left(\log(d)^{3/2} \cdot \mathrm{GVolLB}(F, \varepsilon)\right). \]

Proof. The mechanism $\mathrm{NIM}(F, d, \varepsilon/\log(d))$ satisfies $\varepsilon$-differential privacy, by Lemma 7.2. The error is at most $\log(d)\sqrt{\log d} \cdot \mathrm{GVolLB}(F, \varepsilon)$ as a direct consequence of Lemma 7.10. ■

Thus our lower bound GVolLB and the mechanism NIM are both within $O(\log^{3/2} d)$ of the optimum.

8 Efficient implementation of our mechanism 

We will first describe how to implement our mechanism in the case where $K$ is isotropic. Recall that we first sample $R \sim \mathrm{Gamma}(d+1, \varepsilon^{-1})$ and then sample a point $a$ uniformly at random from $RK$. The first step poses no difficulty. Indeed, when $U_1, \dots, U_{d+1}$ are independently distributed uniformly over the interval $(0, 1]$, then a standard fact tells us that $\varepsilon^{-1}\sum_{i=1}^{d+1} -\ln(U_i) \sim \mathrm{Gamma}(d+1, \varepsilon^{-1})$. Sampling uniformly from $K$ on the other hand may be hard. However, there are ways of sampling nearly uniform points from $K$ using various types of rapidly mixing random walks. In this section, we will use the Grid Walk for simplicity even though there are more efficient walks that will work for us. We refer the reader to the survey of Vempala [Vem05] or the original paper of Dyer, Frieze and Kannan [DFK91] for a description of the Grid Walk and background information. Informally, the Grid Walk samples nearly uniformly from a grid inside $K$, i.e., $\mathcal{L} \cap K$ where we take $\mathcal{L}$ to be a scaled copy of the lattice $\mathbb{Z}^d$. The Grid Walk poses two requirements on $K$:

1. Membership in $K$ can be decided efficiently.

2. $K$ is bounded, in the sense that $B_2^d \subseteq K \subseteq d\,B_2^d$.

Both conditions are naturally satisfied in our case where $K = FB_1^n$ for some $F \in [-1, 1]^{d\times n}$. Indeed, $K \subseteq B_\infty^d \subseteq \sqrt{d}\,B_2^d$ and we may always assume that $B_2^d \subseteq K$, for instance, by considering $K' = K + B_2^d$ rather than $K$. This will only increase the noise level by 1 in Euclidean distance.
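The standard fact about Gamma sampling quoted at the start of this section is easy to put into code for any integer shape parameter. The following is a minimal sketch with a function name of our choosing; it uses `1 - rng.random()` so that the uniform variates lie in $(0, 1]$ as the statement requires.

```python
import math
import random

def gamma_from_uniforms(shape, eps, rng):
    """Sum of `shape` i.i.d. Exp(1) draws scaled by 1/eps, i.e. Gamma(shape, 1/eps)."""
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1]
    # and the logarithm is always finite.
    return sum(-math.log(1.0 - rng.random()) for _ in range(shape)) / eps

rng = random.Random(0)
r = gamma_from_uniforms(shape=4, eps=2.0, rng=rng)
```

The mean of the result is `shape / eps`, which gives a quick sanity check on any implementation.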
Notice that $K'$ is convex. In order to implement the membership oracle for $K$, we need to be able to decide for a given $a \in \mathbb{R}^d$, whether there exists an $x \in B_1^n$ such that $Fx = a$. These constraints can be encoded using a linear program. In the case of $K'$ this can be done using convex programming [GLS94].
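One possible way to write the linear program for membership in $K$ is sketched below. The auxiliary variables $t \in \mathbb{R}^n$, which bound the absolute values $|x_i|$, and the choice of objective are ours and are only one of several equivalent encodings.

```latex
\begin{aligned}
\|a\|_K \;=\; \min_{x,\,t \,\in\, \mathbb{R}^n} \quad & \sum_{i=1}^{n} t_i \\
\text{subject to} \quad & Fx = a, \\
& -t_i \le x_i \le t_i, \qquad i = 1, \dots, n .
\end{aligned}
```

With this encoding, $a \in K = FB_1^n$ exactly when the program is feasible and its optimal value is at most 1, and the optimal value itself is the quantity $\|a\|_K$ used by the mechanism.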

The mixing time of the Grid Walk is usually quantified in terms of the total variation (or $L_1$) distance between the random walk and its stationary distribution. The stationary distribution of the Grid Walk is the uniform distribution over $\mathcal{L} \cap K$. Standard arguments show that an $L_1$-bound gives us $\delta$-approximate $\varepsilon$-differential privacy where $\delta$ can be made exponentially small in polynomial time. In order to get exact privacy ($\delta = 0$) we instead need a multiplicative guarantee on the density of the random walk at each point in $K$.

In Appendix A, we show that the Grid Walk actually satisfies mixing bounds in the relative $L_\infty$-metric which gives us the following theorem. We also need to take care of the fact that the stationary distribution is a priori not uniform over $K$. A solution to this problem is shown in the appendix as well.

Theorem 8.1. Let $P_t$ denote the Grid Walk over $K$ at time step $t$. Given a linear mapping $F\colon \mathbb{R}^n \to \mathbb{R}^d$ and $x \in \mathbb{R}^n$, consider the mechanism $M'$ which samples $R \sim \mathrm{Gamma}(d+1, \varepsilon^{-1})$ and then outputs $Ra$ where $a \sim P_t$. Then, there is some $t \le \mathrm{poly}(d, \varepsilon^{-1})$ such that

1. $M'$ is $O(\varepsilon)$-differentially private,

2. $\mathrm{err}(M', F) = \mathrm{err}(M, F) + O(1)$, where $M$ denotes the $K$-norm mechanism.

We conclude that the Grid walk gives us an efficient implementation of our mechanism which 
achieves the same error bound (up to constants) and e-differential privacy. 

Remark 8.2. The runtime stated in Theorem 8.1 depends only upon $d$ and $\varepsilon$. The polynomial
dependence on n only comes in when implementing the separation oracle for K as described earlier. 
Since we think of d as small compared to n, the exact runtime of our algorithm heavily depends 
upon how efficiently we can implement the separation oracle. 

8.1 When K is not isotropic 

In the non-isotropic case we additionally need to compute the subspaces U and V to project onto 
(Step 2 of the algorithm). Note that these subspaces themselves depend only on the query F and 
not on the database x. Thus these can be published and the mechanism maintains its privacy for an 
arbitrary choice of subspaces U and V. The choice of U, V in Section 7 depended on the covariance 
matrix M, which we do not know how to compute exactly. We next describe a method to choose 
U and V that is efficient such that the resulting mechanism has essentially the same error. The 
sampling from K can then be replaced by approximate sampling as in the previous subsection, 
resulting in a polynomial-time differentially private mechanism with small error. 

Without loss of generality, K has the property that B d Q K C d^B^. In this ^ d 4 

so that with $O(d^4 \log d)$ (approximately uniform) samples from $K$, Chernoff bounds imply that the sample covariance matrix approximates the covariance matrix well. In other words, we can construct a matrix $\tilde{M}_K$ such that each entry of $\tilde{M}_K$ is within $\mathrm{neg}(d)$ of the corresponding entry in $M_K$. Here and in the rest of the section, $\mathrm{neg}(d)$ denotes a negligible function bounded above by $1/d^C$ for a large enough constant $C$, where the constant may vary from one use to the next. Let the eigenvalues of $\tilde{M}_K$ be $\tilde\sigma_1, \dots, \tilde\sigma_d$ with corresponding eigenvectors $\tilde u_1, \dots, \tilde u_d$. Let $\tilde T$ be $\tilde{M}_K^{-1/2}$, and let $\tilde P$ be the projection operator onto the span of the first $d'$ eigenvectors of $\tilde{M}_K$. This defines our subspaces $U$ and $V$, and hence the mechanism. We next argue that Lemma 7.10 continues to hold. First note that for any $i \ge d' + 1$,

\[ \begin{aligned}
\mathbb{E}\langle a, \tilde u_i\rangle^2 &= \tilde u_i^{T} M_K \tilde u_i \\
&= \tilde u_i^{T} \tilde M_K \tilde u_i + \tilde u_i^{T} (M_K - \tilde M_K) \tilde u_i \\
&= \tilde\sigma_i + \mathrm{neg}(d).
\end{aligned} \]

Thus, Equation 18 continues to hold with $\tilde\sigma_{d'+1}$ replacing $\sigma_{d'+1}$.

To prove that Proposition 7.7 continues to hold (with $\tilde M, \tilde T, \tilde P$ replacing $M, T, P$), we note that the only place in the proof that we used that $M_K$ is in fact the covariance matrix of $K$ is (16), when we require $TK$ to be isotropic. We next argue that (16) holds for $\tilde T K$ if $\tilde M_K$ is a good enough approximation to $M_K$. This would imply Proposition 7.7 and hence the result.

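First recall that Wedin's theorem [Wed72] states that for non-singular matrices $R$, $\tilde R$,
\[ \|R^{-1} - \tilde R^{-1}\|_2 \le \frac{1+\sqrt{5}}{2}\, \|R - \tilde R\|_2 \cdot \max\left\{\|R^{-1}\|_2^2,\ \|\tilde R^{-1}\|_2^2\right\}. \]

Using this for the matrices $M_K^{1/2}$, $\tilde M_K^{1/2}$ and using standard perturbation bounds gives (see e.g. [KM08]):
\[ \|\tilde T - T\|_2 \le O(1)\,\|T\|_2^2 \cdot \|M_K^{1/2} - \tilde M_K^{1/2}\|_2 . \tag{20} \]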

Since $\|T\|_2$ is at most $\mathrm{poly}(d)$ and the second term is $\mathrm{neg}(d)$, we conclude that $\|\tilde T - T\|_2$ is $\mathrm{neg}(d)$. It follows that $TK \subseteq \tilde T K + \mathrm{neg}(d) B_2^d$. Moreover, since $TK$ is in isotropic position, it contains a
ball \P 2 - It follows from Lemma C.l in the appendix that is contained in TK. Thus, 

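\[ \begin{aligned}
(1 - \tfrac1d)\, TK &\subseteq (1 - \tfrac1d)\, \tilde T K + \mathrm{neg}(d)\, B_2^d \\
&\subseteq (1 - \tfrac1d)\, \tilde T K + \mathrm{neg}(d)\, \tilde T K \\
&\subseteq \tilde T K,
\end{aligned} \]

where the last containment follows from the fact that $\tilde T K$ is convex and contains the origin. Thus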
(1 - \)PTK C PTK. Since Corollary 3.4 still lower bounds the volume of PTK, we conclude that 

d-k 

1 k 

YoUPTK) 1 ^ > -\o\ k (PTK) l / k > , 

e e 

where we have used the fact that k ^ d so that (1 — ^) fc ^ For k = d' , is 0(1) so 
that \o\k{PTK) 1 ^ d ^^ ^ Q(o.k)- Thus we have shown that up to constants, (16) holds for 
\o\k{PTK) 1 ^ d ~ k ^ which completes the proof. 

9 Generalizations of our mechanism 

Previously, we studied linear mappings $F\colon \mathbb{R}^n \to \mathbb{R}^d$ where $\mathbb{R}^n$ was endowed with the $\ell_1$-metric. However, the $K$-norm mechanism is well-defined in a much more general context. The only property of $K$ used here is its convexity. In general, let $\mathcal{D}$ be an arbitrary domain of databases with a distance function dist. Given a function $F\colon \mathcal{D} \to \mathbb{R}^d$, we could define $K_0 = \{(F(x) - F(x'))/\mathrm{dist}(x, x') : x, x' \in \mathcal{D}\}$ and let $K$ be the convex closure of $K_0$. Then the $K$-norm mechanism can be seen to be differentially private with respect to dist. Indeed, note that $|q(d, a) - q(d', a)| = \big|\,\|F(d) - a\|_K - \|F(d') - a\|_K\big| \le \|F(d) - F(d')\|_K \le \mathrm{dist}(d, d')$, and thus privacy follows from the exponential mechanism.

Moreover, in cases when one does not have a good handle on $K$ itself, one can use any convex body $K'$ containing $K$.

Databases close in $\ell_2$-norm. For example, McSherry and Mironov [MM09] can transform their input data set so that neighboring databases map to points within Euclidean distance at most $R$ for a suitable parameter $R$. Thus dist here is the $\ell_2$ norm and for any linear query, $K$ is an ellipsoid.

Local Sensitivity. Nissim, Raskhodnikova and Smith [NRS07] define smooth sensitivity and show that one can design approximately differentially private mechanisms that add noise proportional to the smooth sensitivity of the query. This can be a significant improvement when the local sensitivity is much smaller than the global sensitivity. Notice that such queries are necessarily non-linear. We point out that one can define a local sensitivity analogue of the $K$-norm mechanism by considering the polytopes $K_x = \mathrm{conv}\left\{\frac{F(x) - F(x')}{\mathrm{dist}(x, x')} : x' \in \mathcal{D}\right\}$ and adapting the techniques of [NRS07] accordingly.
A Mixing times of the Grid Walk in $L_\infty$

In this section, we sketch the proof of Theorem 8.1. We will be interested in the mixing properties of Markov chains over some measured state space $\Omega$. We will need to compare probability measures $\mu$, $\nu$ over the space $\Omega$.

The relative $L_\infty$-distance is defined as
\[ \left\|\frac{\mu}{\nu} - 1\right\|_\infty = \sup_{u \in \Omega} \left| \frac{d\mu(u)}{d\nu(u)} - 1 \right| . \tag{21} \]

For a Markov chain $P$, we will be interested in the mixing time in the $\infty$-metric. That is the smallest number $t$ such that $\|P_t/\pi - 1\|_\infty \le \varepsilon$. Here, $P_t$ is the distribution of $P$ at step $t$ and $\pi$ denotes the stationary distribution of $P$. The relevance of the $\infty$-norm for our purposes is given by the following fact.

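Lemma A.1. Suppose $M = \{\mu_x\}_{x\in\mathbb{R}^n}$ is an $\varepsilon$-differentially private mechanism and suppose $M' = \{\mu'_x\}_{x\in\mathbb{R}^n}$ satisfies $\max\{\|\mu_x/\mu'_x - 1\|_\infty, \|\mu'_x/\mu_x - 1\|_\infty\} \le \varepsilon$ for some $0 \le \varepsilon \le 1$ and all $x \in \mathbb{R}^n$. Then, $M'$ is $3\varepsilon$-differentially private.

Proof. By our second assumption,
\[ \max\left\{ \sup_u \frac{d\mu_x(u)}{d\mu'_x(u)},\ \sup_u \frac{d\mu'_x(u)}{d\mu_x(u)} \right\} \le 1 + \varepsilon \le e^{\varepsilon}, \]
where we used that $1 + \varepsilon \le e^{\varepsilon}$ for $0 \le \varepsilon \le 1$.

Now, let $x, x'$ satisfy $\|x - x'\|_1 \le 1$. By the previous inequality, we have
\[ \sup_u \frac{d\mu'_x(u)}{d\mu'_{x'}(u)} \le e^{2\varepsilon} \sup_u \frac{d\mu_x(u)}{d\mu_{x'}(u)} \le e^{3\varepsilon}. \]
In the last inequality, we used the assumption that $M$ is $\varepsilon$-differentially private. Hence, we have shown that $M'$ is $3\varepsilon$-differentially private. ■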

Now consider the grid walk with a fine enough grid (say side length $\beta$). It is known that a random walk on a grid gets within statistical distance at most $\Delta$ of the uniform distribution in time that is polynomial in $d$, $\beta^{-1}$, and $\log \Delta^{-1}$. Setting $\Delta$ to be smaller than $\varepsilon(\beta/d)^d$, we end up with a distribution that is within $\ell_\infty$ distance at most $\varepsilon$ from the uniform distribution on the grid points in $K$. Let $z$ be a sample from the grid walk, and let $\tilde z$ be a random point from an $\ell_\infty$ ball of radius half the side length of the grid, centered at $z$. Then $\tilde z$ is a (nearly) uniform sample from a body $\tilde K$ which has the property that $(1 - \beta)K \subseteq \tilde K \subseteq (1 + \beta)K$.



A.1 Weak separation oracle

An $\eta$-weak separation oracle for $K'$ is a blackbox that says 'YES' when given $u \in \mathbb{R}^d$ with $(u + \eta B_2^d) \subseteq K'$ and outputs 'NO' when $u \notin K' + \eta B_2^d$. Here, $\eta > 0$ is some parameter that we can typically make arbitrarily small, with the running time depending on $\eta$. Our previous discussion assumed an oracle for which $\eta = 0$. Taking $\eta = \beta$ ensures that the sample above is (nearly) uniform from a body $\tilde K$ such that $(1 - 2\beta)K \subseteq \tilde K \subseteq (1 + 2\beta)K$. By rescaling, we get the following lemma.

Lemma A.2. Let $K$ be a convex body such that $B_2^d \subseteq K \subseteq d\,B_2^d$, and let $\beta > 0$. Suppose $K$ is represented by a $\beta$-weak separation oracle. Then, there is a randomized algorithm $\mathrm{Sample}(K, \beta)$ running in time $\mathrm{poly}(d, \beta^{-1})$ whose output distribution is within $\ell_\infty$-distance at most $\beta$ from the uniform distribution over a body $\tilde K$ such that $K \subseteq \tilde K \subseteq (1 + \beta)K$.

We now argue that such a (nearly) uniform sample from a body close enough to $K$ suffices for the privacy guarantee. Our algorithm first samples $r \sim \mathrm{Gamma}(d+1, \varepsilon^{-1})$, and then outputs $Fx + rz$ where $z$ is the output of $\mathrm{Sample}(K, \beta)$ for $\beta = \min(\varepsilon/d, 1/r)$.

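We can repeat the calculation for the density at a point $a$ in equation (6). Indeed, for a point $a$ with $\|a - Fx\|_K = R$, the density at $a$ conditioned on a sample $r$ from the Gamma distribution is $(1 \pm \beta)/\mathrm{Vol}(r\tilde K)$ whenever $(a/r) \in \tilde K$, and zero otherwise. By our choice of $\beta$, $\mathrm{Vol}(r\tilde K) = (1 \pm \varepsilon)\mathrm{Vol}(rK)$. Moreover $(a/r) \in \tilde K$ for $r \ge R$ and $(a/r) \notin \tilde K$ for $r < R/(1 + \beta)$. Thus the density at $a$ is
\[ g(a) \ge \frac{1 \pm (\varepsilon + \beta)}{\varepsilon^{-(d+1)}\,\Gamma(d+1)} \int_R^\infty \frac{e^{-\varepsilon t}\, t^d}{\mathrm{Vol}(tK)}\, dt = \frac{(1 \pm (\varepsilon + \beta))\, \varepsilon \int_R^\infty e^{-\varepsilon t}\, dt}{\Gamma(d+1)\,\mathrm{Vol}(\varepsilon^{-1}K)} = \frac{(1 \pm (\varepsilon + \beta))\, e^{-\varepsilon R}}{\Gamma(d+1)\,\mathrm{Vol}(\varepsilon^{-1}K)} . \]

Similarly, $(a/r) \notin \tilde K$ for $r < R/(1 + \beta)$ implies that $g(a) \le \frac{(1 \pm (\varepsilon + \beta))\, e^{-\varepsilon R/(1+\beta)}}{\Gamma(d+1)\,\mathrm{Vol}(\varepsilon^{-1}K)}$. It follows that $g(a)$ is within an $\exp(O(\varepsilon))$ factor of the ideal density.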

Finally, the bound on the moments of the Gamma distribution from Fact 2.10 implies that the 
expected running time of this algorithm is polynomial in d, 



B Lower bounds for Differential Privacy with respect to Hamming Distance

While our lower bounds were proved for differential privacy in the $\ell_1$-metric, the usual notion of differential privacy uses Hamming distance instead. In this section we argue that for small enough $\varepsilon$, our lower bounds can be extended to the usual definition. Let the database be a vector $w \in [n]^N$ where each individual has a private value in $[n]$. Such a database can be transformed to its histogram $x = x(w) \in \mathbb{Z}_{\ge 0}^n$ where $x_i(w)$ denotes the number of inputs that take value $i$, i.e. $x_i(w) = |\{j : w_j = i\}|$. A linear query $F$ on the histogram is a sensitivity 1 query on the database $w$, and a mechanism $M$ is $\varepsilon$-differentially private with respect to the Hamming distance on $w$, if and only if it is differentially private with respect to the $\ell_1$ norm, when restricted to non-negative integer vectors $x$.

We can then repeat the proof of Theorem 3.3, with minor modifications to handle the non-negative integer constraint.

Theorem B.1. Let $\varepsilon > 0$ and suppose $F \in \{-1, 1\}^{d\times n}$ is a linear map and let $K = FB_1^n$. Then, every $\varepsilon$-differentially private mechanism $M$ for computing $G(w) = Fx(w)$ must have
\[ \mathrm{err}(M, G) \ge \Omega(\mathrm{VolLB}(F, \varepsilon)), \tag{22} \]
whenever $\varepsilon < c\,d\,\mathrm{Vol}(K)^{1/d}/\sqrt{n}$, for a universal constant $c$.

Proof. Let $R = \mathrm{Vol}(K)^{1/d}$. By Fact 3.2 and our assumption, $(d/4\varepsilon)K = F((d/4\varepsilon)B_1^n)$ contains a $CRd\sqrt{d}/4\varepsilon$-packing $Y \subseteq \mathbb{R}^d$ of size at least $\exp(d)$, for some constant $C$. Let $X \subseteq (d/4\varepsilon)B_1^n$ be a set of arbitrarily chosen preimages of $y \in Y$ so that $|X| = |Y|$ and $FX = Y$.

Now we come up with a similar set $X' \subseteq \mathbb{Z}^n$. For each $x \in X$, we round each $x_i$ randomly up or down, i.e. $\tilde x_i = \lceil x_i \rceil$ with probability $(x_i - \lfloor x_i \rfloor)$, and $\lfloor x_i \rfloor$ otherwise. It is easy to check that $\mathbb{E}[\tilde x] = x$, so that with probability $2/3$, $\|\tilde x\|_1 \le 3\|x\|_1$. Moreover, $\mathbb{E}[F\tilde x] = Fx$ and each random choice can change $\|F\tilde x\|$ by at most $\sqrt{d}$. Thus martingale concentration results imply that with probability $2/3$, $\|F\tilde x - Fx\| \le 2\sqrt{dn}$. Thus there exists a choice of $\tilde x$ so that both these events happen. Let $v$ denote the vector $(\lceil d/2\varepsilon \rceil, \lceil d/2\varepsilon \rceil, \dots, \lceil d/2\varepsilon \rceil)$ and set $x' = \tilde x + v$. This defines our set $X'$ which is easily seen to be in $\mathbb{Z}_{\ge 0}^n$. In fact, $X' \subseteq v + \lceil d/2\varepsilon \rceil B_1^n$. Moreover, for $\varepsilon < CRd/32\sqrt{n}$, $FX'$ is a $CRd\sqrt{d}/8\varepsilon$-packing.
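The randomized rounding step used in this proof can be sketched as follows; the function name is ours. Each coordinate is rounded up with probability equal to its fractional part, so the rounding is unbiased and coordinates that are already integers are left alone.

```python
import math
import random

def randomized_round(x, rng):
    """Round each coordinate up with probability equal to its fractional part.

    Integer coordinates are unchanged, and the expectation of the result is x.
    """
    out = []
    for xi in x:
        lo = math.floor(xi)
        frac = xi - lo
        out.append(lo + 1 if rng.random() < frac else lo)
    return out

rng = random.Random(0)
x = [0.25, -1.5, 3.0, 2.75]
x_tilde = randomized_round(x, rng)
```

Because every coordinate moves by less than 1, applying a matrix with entries in $\{-1, 1\}$ to the rounded vector changes each query answer by a bounded amount per coordinate, which is what the martingale argument in the proof exploits.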

Now assume that $M = \{\mu_x : x \in \mathbb{R}^n\}$ is an $\varepsilon$-differentially private mechanism with error $CRd\sqrt{d}/32\varepsilon$ and lead this to a contradiction. By the assumption on the error, Markov's inequality implies that for all $x \in X'$, $\mu_x(B_x) \ge \tfrac12$, where $B_x$ is a ball of radius $CRd\sqrt{d}/16\varepsilon$ centered at $Fx$. By the packing property above, the balls $\{B_x : x \in X'\}$ are disjoint.

Since $\|x' - v\|_1 \le (d/2\varepsilon)$, it follows from $\varepsilon$-differential privacy with Fact 2.3 that
\[ \mu_v(B_x) \ge \exp(-\varepsilon(d/2\varepsilon))\, \mu_x(B_x) \ge \tfrac12 \exp(-d/2). \]

Since the balls $B_x$ are pairwise disjoint,
\[ 1 \ge \mu_v\Big(\bigcup_{x\in X'} B_x\Big) = \sum_{x\in X'} \mu_v(B_x) \ge \exp(d) \cdot \tfrac12 \exp(-d/2) > 1 \tag{23} \]
for $d \ge 2$. We have thus obtained a contradiction. ■

Translating the lower bound from Theorem 5.2 to this setting, we get

Theorem B.2. Let $\varepsilon \in \big(0,\ c\sqrt{d/n} \cdot \min\{\sqrt{d}, \sqrt{\log(n/d)}\}\big)$ for a universal constant $c$ and let $d \ge \log n$. Then there exists a linear map $F \in \{-1, 1\}^{d\times n}$ such that every $\varepsilon$-differentially private mechanism $M$ for computing $G(w) = Fx(w)$ must have
\[ \mathrm{err}(M, G) \ge \Omega(d/\varepsilon) \cdot \min\left\{\sqrt{d}, \sqrt{\log(n/d)}\right\}. \tag{24} \]
We remark that this lower bound holds for $N = \Omega(nd/\varepsilon)$.

C Dilated Ball containment

Lemma C.1. Let $A$ be a convex body in $\mathbb{R}^d$ such that $B_2^d \subseteq A + rB_2^d$ for some $r < 1$. Then a dilation $(1 - r)B_2^d$ is contained in $A$.

Proof. Let $z \in \mathbb{R}^d$ be a unit vector. Suppose that $z' = (1 - r)z \notin A$. Then by the Separating Hyperplane theorem (see, e.g., [BV04]), there is a hyperplane $H$ separating $z'$ from $A$. Thus there is a unit vector $w$ and a scalar $b$ such that $\langle z', w\rangle - b = 0$ and $\langle u, w\rangle - b \le 0$ for all $u \in A$. Let $v = z' + rw$. Then by the triangle inequality, $\|v\| \le 1$. Moreover,
\[ d(v, A) = \inf_{u\in A} \|u - v\| \ge \inf_{u\in A} \langle v - u, w\rangle \ge b + r - \sup_{u\in A} \langle u, w\rangle \ge r. \]
This however contradicts the assumption that $v \in B_2^d \subseteq A + rB_2^d$. Since $z$ was arbitrary, the lemma is proved. ■